Current Semester: Fall 2023
A Data Science and Statistical Approach to Programming Eng 10200 [Fall 2023]
Class Website: [link]
Class Description:
Introduces the basic ideas of programming needed to demonstrate data science for engineering. Includes the basics of the Python language and ideas of programming while going through a basic workflow of reading in data, basic analysis, and visualization. Some basic ideas of probability and statistics will also be introduced from a computational rather than theoretical approach. No previous programming experience is required.
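As an illustration of the workflow described above (reading in data, a basic analysis, and a computational take on probability), a minimal sketch using only the Python standard library might look like the following. The temperature readings, column names, and function names are invented for illustration and are not course material; the visualization step is left out because it would need a plotting library.

```python
import csv
import io
import random
import statistics

# Hypothetical data set, inlined so the example is self-contained.
RAW_CSV = """day,temperature_c
1,20.5
2,22.0
3,19.5
4,23.0
"""


def read_temperatures(text):
    """Read the CSV text and return the temperature column as floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [float(row["temperature_c"]) for row in reader]


def estimate_probability_of_seven(trials, seed=0):
    """Estimate P(two dice sum to 7) by simulation rather than by formula."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if rng.randint(1, 6) + rng.randint(1, 6) == 7
    )
    return hits / trials


temps = read_temperatures(RAW_CSV)
mean_temp = statistics.mean(temps)
estimate = estimate_probability_of_seven(100_000)
print(f"mean temperature: {mean_temp:.2f} C")
print(f"estimated P(sum == 7): {estimate:.3f} (exact value is 1/6)")
```

The simulated estimate should land close to the exact value of 1/6, which is the sense in which probability can be approached computationally rather than theoretically.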
Past Semesters
Semester: Spring 2023
Applied Machine Learning and Data Mining, DSE I2100 [Spring 2023]
Class Website: [link]
Class Description:
Introduction to machine learning, data mining, and statistical pattern recognition. Topics include:
- Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks, deep learning)
- Unsupervised learning (clustering, non-parametric techniques, dimensionality reduction)
- Best practices in machine learning (bias/variance theory, model selection and evaluation, resampling)
In this class, you will learn about the most effective machine learning techniques and gain practice implementing them and getting them to work for yourself. More importantly, you will learn not only the theoretical underpinnings of learning but also the practical know-how needed to apply these techniques quickly and effectively to new problems.
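To give a flavor of the resampling topic listed above, here is a minimal sketch of the bootstrap: drawing repeated samples with replacement to gauge how much an estimate varies. It uses only the standard library; the error values and function names are hypothetical, and the percentile interval is one simple choice among several.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Return the mean of each bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    size = len(sample)
    return [
        statistics.mean(rng.choices(sample, k=size))
        for _ in range(n_resamples)
    ]


def percentile_interval(values, lower=0.025, upper=0.975):
    """Return a simple percentile interval from a list of values."""
    ordered = sorted(values)
    lo_index = int(lower * (len(ordered) - 1))
    hi_index = int(upper * (len(ordered) - 1))
    return ordered[lo_index], ordered[hi_index]


# Hypothetical measurements, e.g. model errors on ten held-out examples.
errors = [0.12, 0.08, 0.15, 0.10, 0.09, 0.20, 0.11, 0.07, 0.13, 0.10]
means = bootstrap_means(errors)
low, high = percentile_interval(means)
print(f"sample mean: {statistics.mean(errors):.3f}")
print(f"approximate 95% bootstrap interval: [{low:.3f}, {high:.3f}]")
```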
Deep Neural Networks and Applications with Tensorflow, CSC I1910 [Spring 2023]
Class Website: [link]
Class Description:
This course will introduce deep neural networks and the main kinds of architectures, explore some applications, and use Python and TensorFlow 2.0. The course will assume some familiarity with programming in Python, probability and statistics, linear algebra, and calculus. It will also assume some familiarity with machine learning and/or artificial intelligence, although this material will be briefly reviewed. Tentative topics will include:
- Review of Machine Learning with Python, Pandas, Sklearn and TensorFlow 2.0
- Multi-layer Neural Networks
- Convolutional Neural Networks
- Sequence models (e.g. Recurrent Neural Networks)
- Generative Models
A Data Science and Statistical Approach to Programming Eng 10200 [Fall 2022]
Class Website: [link]
Class Description:
Introduces the basic ideas of programming needed to demonstrate data science for engineering. Includes the basics of the Python language and ideas of programming while going through a basic workflow of reading in data, basic analysis, and visualization. Some basic ideas of probability and statistics will also be introduced from a computational rather than theoretical approach. No previous programming experience is required.
Applied Machine Learning and Data Mining, DSE I2100 [Spring 2022]
Class Website: [link]
Class Description:
Introduction to machine learning, data mining, and statistical pattern recognition. Topics include:
- Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks, deep learning)
- Unsupervised learning (clustering, non-parametric techniques, dimensionality reduction)
- Best practices in machine learning (bias/variance theory, model selection and evaluation, resampling)
In this class, you will learn about the most effective machine learning techniques and gain practice implementing them and getting them to work for yourself. More importantly, you will learn not only the theoretical underpinnings of learning but also the practical know-how needed to apply these techniques quickly and effectively to new problems.
Machine Learning for Finance and Trading, DSE G2200 [Spring 2022]
Class Website: [link]
Class Description:
Machine learning (ML) has become an essential tool in finance and trading, automating the identification of patterns and providing more accurate estimates of risk and valuation. Topics of this course will include the acquisition of financial data, working with time series, computing well-known technical features, visualization, trading signals, backtesting, training and evaluating ML models, and supervised learning. Additional topics may include unsupervised and reinforcement learning, deep learning models, and working with different data sources such as sentiment. This course uses Python and Python libraries used in the industry for analysis and visualization. The course assumes a college- or master's-level statistics course and some knowledge of computer programming. Some knowledge of Python, NumPy, and pandas, as well as basic familiarity with machine learning topics such as classification and logistic regression, is helpful. Data and examples use both traditional and crypto-asset markets.
A Data Science and Statistical Approach to Programming Eng 10200 [Fall 2021]
Class Website: [link]
Class Description:
Introduces the basic ideas of programming needed to demonstrate data science for engineering. Includes the basics of the Python language and ideas of programming while going through a basic workflow of reading in data, basic analysis, and visualization. Some basic ideas of probability and statistics will also be introduced from a computational rather than theoretical approach. No previous programming experience is required.
Applied Machine Learning and Data Mining, DSE I2100 [Spring 2021]
Class Website: [link]
Class Description:
Introduction to machine learning, data mining, and statistical pattern recognition. Topics include:
- Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks, deep learning)
- Unsupervised learning (clustering, non-parametric techniques, dimensionality reduction)
- Best practices in machine learning (bias/variance theory, model selection and evaluation, resampling)
In this class, you will learn about the most effective machine learning techniques and gain practice implementing them and getting them to work for yourself. More importantly, you will learn not only the theoretical underpinnings of learning but also the practical know-how needed to apply these techniques quickly and effectively to new problems.
Deep Neural Networks and Applications with Tensorflow, CSC I1910 [Spring 2021]
Class Website: [link]
Class Description:
This course will introduce deep neural networks and the main kinds of architectures, explore some applications, and use Python and TensorFlow 2.0. The course will assume some familiarity with programming in Python, probability and statistics, linear algebra, and calculus. It will also assume some familiarity with machine learning and/or artificial intelligence, although this material will be briefly reviewed. Tentative topics will include:
- Review of Machine Learning with Python, Pandas, Sklearn and TensorFlow 2.0
- Multi-layer Neural Networks
- Convolutional Neural Networks
- Sequence models (e.g. Recurrent Neural Networks)
- Generative Models
A Data Science and Statistical Approach to Programming Eng 19999 (future Eng 10200) [Fall 2020]
Class Website: [link]
Class Description:
Introduces the basic ideas of programming needed to demonstrate data science for engineering. Includes the basics of the Python language and ideas of programming while going through a basic workflow of reading in data, basic analysis, and visualization. Some basic ideas of probability and statistics will also be introduced from a computational rather than theoretical approach. No previous programming experience is required.
Deep Neural Networks and Applications with Tensorflow CSC I1910 (CCNY), C SC 84200 (GC) [Spring 2020]
Class Website: [link]
Class Description:
This course will introduce deep neural networks and the main kinds of architectures, explore some applications, and use Python and TensorFlow 2.0. The course will assume some familiarity with programming in Python, probability and statistics, linear algebra, and calculus. It will also assume some familiarity with machine learning and/or artificial intelligence, although this material will be briefly reviewed. Tentative topics will include:
- Review of Machine Learning with Python, Pandas, Sklearn and TensorFlow 2.0
- Multi-layer Neural Networks
- Convolutional Neural Networks
- Sequence models (e.g. Recurrent Neural Networks)
- Generative Models
Assessment will be based on homework exercises, student-developed tutorials, a midterm and a group project. Ph.D. students will also be expected to develop a 10-minute video reviewing a paper with novel code applying the technique.
Applied Machine Learning and Data Mining DSE I2100 [Spring 2020/Spring 2019]
Class Website: [link]
Class Description:
Introduction to machine learning, data mining, and statistical pattern recognition. Topics include:
- Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks, deep learning)
- Unsupervised learning (clustering, non-parametric techniques, dimensionality reduction)
- Best practices in machine learning (bias/variance theory, model selection and evaluation, resampling)
In this class, you will learn about the most effective machine learning techniques and gain practice implementing them and getting them to work for yourself. More importantly, you will learn not only the theoretical underpinnings of learning but also the practical know-how needed to apply these techniques quickly and effectively to new problems.
Web Development (aka. Web Site Design) CSc 473 [Fall 2019/Spring 2019]
Class Description:
The design and implementation of web sites and web applications. This course will focus on foundational tools in "full-stack" web development. You will focus on building a real, or at least realistic, web solution to solve a "business problem". There will be an emphasis on testing, working in a small team, and software engineering best practices.
Introduction to Data Science, DSE I1020 [Fall 2019/Fall 2018]
Class Description:
This course will present a survey of data science and introduce some of the core data science tools. While some programming experience is required, the course will include a rapid introduction to data science programming and the stack of tools needed to process, visualize, and analyze data with a language such as R or Python. Students will be given a high-level survey of data engineering, visual analytics, applied statistics, machine learning, and big data. The course will illustrate these topics by taking students through real data sets and case studies.
Generative Adversarial Networks Senior Project Capstone Part II, CSc 59867 [Fall 2018]
Class Description:
In this capstone course, students will develop the skills needed to work with the latest methods in machine learning: Deep and Generative Adversarial Networks. While most Deep Neural Networks (DNNs) have been used for classification or regression, Generative Adversarial Networks (GANs) are capable of sampling from a probability distribution modeled on a training data set. This means that, given a large set of training data, we can attempt to generate more data similar to the original. For example, if we have a database of face images, we could generate more face images; that is, the network would synthesize entirely new faces. Moreover, exciting work has been done which makes it possible to change the attributes of an image. For instance, it is possible to transmute a summer landscape to a winter landscape, or a horse to a zebra and back. This methodology goes way beyond images. It has been applied to voice synthesis and the synthesis of 3D models. The goal of the projects will be to apply some of these methods to a range of other problems in different domains where ample data is available. This first part of the capstone will focus on building experience with the Machine Learning framework. Students will also need to pick up critical Python libraries such as numpy, scipy, pandas, matplotlib, sklearn, and pytorch. In addition, statistical concepts will be reviewed.
This is the second in a two-semester capstone course. The student is required to complete a significant project in computer science or engineering under the mentorship of a faculty member. In addition to technical material required for successful completion of a specific project, topics include identification of a problem, background research, social, ethical and economic considerations, intellectual property and patents and proposal writing, including methods of analysis and theoretical modeling. A detailed project proposal is formulated in the first semester, and the project is completed in the second semester. Each student is required to write an in-depth report, and to make an oral presentation to the faculty. Senior year students only, or permission of the department.
Generative Adversarial Networks Senior Project Capstone Part I, CSc 59866 [Spring 2018]
Class Description:
In this capstone course, students will develop the skills needed to work with the latest methods in machine learning: Deep and Generative Adversarial Networks. While most Deep Neural Networks (DNNs) have been used for classification or regression, Generative Adversarial Networks (GANs) are capable of sampling from a probability distribution modeled on a training data set. This means that, given a large set of training data, we can attempt to generate more data similar to the original. For example, if we have a database of face images, we could generate more face images; that is, the network would synthesize entirely new faces. Moreover, exciting work has been done which makes it possible to change the attributes of an image. For instance, it is possible to transmute a summer landscape to a winter landscape, or a horse to a zebra and back. This methodology goes way beyond images. It has been applied to voice synthesis and the synthesis of 3D models. The goal of the projects will be to apply some of these methods to a range of other problems in different domains where ample data is available. This first part of the capstone will focus on building experience with the Machine Learning framework. Students will also need to pick up critical Python libraries such as numpy, scipy, pandas, matplotlib, sklearn, and pytorch. In addition, statistical concepts will be reviewed.
This is the first in a two-semester capstone course. The student is required to complete a significant project in computer science or engineering under the mentorship of a faculty member. In addition to technical material required for successful completion of a specific project, topics include identification of a problem, background research, social, ethical and economic considerations, intellectual property and patents and proposal writing, including methods of analysis and theoretical modeling. A detailed project proposal is formulated in the first semester, and the project is completed in the second semester. Each student is required to write an in-depth report, and to make an oral presentation to the faculty. Senior year students only, or permission of the department.
Topics in Front End Web Application Development, CSC 59940 [Spring 2018] (mentor-teaching)
Class Website: [link]
Instructors: David Moon, Michelle Shu, TA: Ricardo Rodriguez
Class Description:
This course will teach the basic design and implementation of a web application, with an emphasis on front-end development using Facebook’s React framework. By the end of the course, you should have a basic toolkit to build fully functional web applications for hackathons, personal projects, or freelance work.
The course will provide an overview of component based web frameworks combining HTML, CSS, JavaScript, and Firebase backend-as-a-service. It will also cover industry best practices around agile development, version control, use of build tools, and designing for optimal user experience.
Why do you need to know this?
Full-stack web development is a critical skill. One shouldn’t think of web technologies as being “for the web” but rather as general-purpose software development skills for a range of applications. Most mobile and desktop applications use web technologies for communication with remote servers. More and more user interface development, even for desktops, now exploits web technology such as HTML5, CSS, and JavaScript. Even server-to-server communication often uses web APIs to enforce modularity and allow access from many different platforms. You should be able to stand up an application using a cloud service such as those provided by Amazon, Google, Microsoft Azure, Rackspace, or Heroku. Once you can build a basic application, from database communication to user interface, and deploy it, you have some software development superpowers. You can take almost any new idea, such as a new social app, marketplace, or multi-player game, and build, deploy, and distribute it with very little investment besides your imagination, your time, and your sweat.
What technology will I learn/use?
The course uses HTML5, CSS, JavaScript, and Python for server-side programming. Initially we use the Python micro-framework “Flask”. Flask is a light “pay for what you eat” framework providing routing and (server-side) templates without much ceremony. Once projects begin, most of them use Django, as it provides nearly every basic service and component a web application would need (at the cost of some learning curve).
In addition, you will be expected to use software engineering and development best practices. Your code must pass code linting, for example, using pylint. You will need to write pure unit tests using mocking, integration tests using the framework (Django/Flask), and acceptance tests using Selenium or something equivalent. You will need to write developer and project documentation. You and your team will need to track project issues and maintain the code using a collection of forks of code repositories in a distributed version control system (e.g. git or mercurial). Individual grades on group projects are determined both by the overall project quality and by individual contribution via code commits and project management, as visible from the code repository. Weekly status summaries on project progress become an important part of the second half of the course.
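As a rough illustration of what a pure unit test with mocking can look like, here is a minimal sketch using the standard library's `unittest` and `unittest.mock`. The function `fetch_username` and the `client.get_user` interface are hypothetical, invented for this example; they are not part of the course materials or of Flask or Django.

```python
import unittest
from unittest import mock


def fetch_username(client, user_id):
    """Return the display name for a user, or 'anonymous' if not found.

    `client` is any object with a `get_user(user_id)` method (a hypothetical
    interface); in a real project it might wrap a database or a web API.
    """
    record = client.get_user(user_id)
    if record is None:
        return "anonymous"
    return record["name"].title()


class FetchUsernameTest(unittest.TestCase):
    def test_formats_name_from_client(self):
        # The mock stands in for the real client, so no database is needed.
        client = mock.Mock()
        client.get_user.return_value = {"name": "ada lovelace"}
        self.assertEqual(fetch_username(client, 7), "Ada Lovelace")
        client.get_user.assert_called_once_with(7)

    def test_missing_user_is_anonymous(self):
        client = mock.Mock()
        client.get_user.return_value = None
        self.assertEqual(fetch_username(client, 99), "anonymous")


def run_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FetchUsernameTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```

Because the client is mocked, the test exercises only the logic in `fetch_username`, which is what distinguishes a pure unit test from an integration test that goes through the framework.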
Due to time limitations, we cannot delve into rich user interface frameworks like React or AngularJS. We cover core JavaScript and Ajax, and most projects use a front-end framework such as Twitter Bootstrap or Foundation. We also touch on the foundations of design, including the use of color, fonts, and accessibility.
Why can’t I just use PHP/Go/Scala-Play/Node.js/.net/Ruby on Rails/etc.?
I am evaluating a whole package of technologies that work together, from testing to documentation, from database to front end. There isn’t even enough time in the course to teach all you need to know with the set of technologies I have chosen. Certainly for the purposes of instruction, the choices must be limited. Because we have group projects, often with students who have limited web development experience, it would not be reasonable to ask them to learn a whole new set of tools unsupported by the initial part of the course. That said, if a team can show that they have every part of the software support in place, e.g. automated testing, documentation, linting, and a full-featured web framework, I will consider the argument.
For server-side technology, there is much more room for debate. It is hard to establish solid market share numbers, but server-side Ruby on Rails, Python, and Node.js tend to be the most popular for newer projects. PHP is on the decline, and while many important mature frameworks such as WordPress and Drupal are written in PHP, PHP presents challenges for software engineering best practices. Ruby, like Python, is a good teaching language, has excellent full-featured frameworks like Rails and light ones like Sinatra, has robust testing packages, and is probably slightly more popular and mature than Python as a server-side technology. Python has the advantage of being more broadly used outside of server-side web development, providing synergy with data science, integration with the internet of things, and system administration. Moreover, it is more common for students to come with some knowledge of Python than of Ruby.
More recently, the rise of JavaScript server-side technology, based on Node.js, has led me to consider it for use every semester. Unfortunately, JavaScript is so badly fragmented and so intensely in flux that it becomes very challenging to settle on a stable set of choices. There is even great debate on what “best practice” means for basic JavaScript programming. On the one hand, TypeScript and ES7 introduce types and the traditional classes found in most other object-oriented languages, addressing criticisms of JS by reducing boilerplate code and adding type safety. On the other hand, fans of functional programming argue that much of this is wrong-headed and repeats the mistakes of other languages such as Java or C++. Server-side frameworks such as ExpressJS or Meteor do not yet provide the full set of features that Django or Ruby on Rails currently do. The JavaScript tooling remains in flux, with battles between Grunt vs. Gulp or Jasmine vs. Mocha raging on.
The choice of JavaScript for the client-side technology is not much of a choice. For the moment, the client-side wars are over and JavaScript has prevailed. Moreover, the capabilities of client-side JavaScript become more impressive each week. With a solid knowledge of Python and Django/Flask on the backend, using (usually) a PostgreSQL database, and HTML5, CSS3, and JavaScript/jQuery on the front end, students are reasonably well equipped to build a wide range of applications.
The subject matter of this course will be similar to CSc 59969. However, students will be required to read and present research papers from the field, and there will be guest speakers from the data visualization field. There will be some hands on instructions and lectures as well as a group project.
The exciting innovations in AI/Machine Learning/Data Science and Cryptocurrencies (like BitCoin) hold the promise of revolutionizing the Financial Industry. Banks and financial firms in New York and around the world are scrambling for talent that can help drive this revolution. This course will provide students with a guide to some of the tools at the core of these innovations, and help students see how these tools can be applied to the real world business problems firms are looking to solve.
The course brings together engineering and business students, organized in teams, and takes them through the process of starting a FinTech business venture. Using Lean Launchpad and experiential learning methodologies, students develop and test business models that can be used to chart out road maps for developing startups into actual businesses. In addition to introducing students to the business fundamentals of FinTech and to cutting-edge software technologies, the course will feature guest speakers from financial firms such as Standard Chartered Bank and technology companies like Google, through a partnership with the Zahn Innovation Center. Thus the course provides a unique opportunity to make valuable career contacts. This course counts as a CS technical elective, area B or C. The CPE/CS prerequisites are CSc 22100 and CSc 30100.
One of the fastest growing job opportunities is that of "Data Scientist." The future belongs to the companies and people that turn data into successful products.
The revolution resulting from the vast amounts of data coming from measurement and transactions on an unprecedented scale is transforming not only business but also science and engineering. Interpreting data and making it useful depends critically on visualization methods. Visualization of data is important both for exploring the data and for crafting a story with data to convince your audience of your results.
This course will give an overview of data visualization as well as the overlapping fields of information and scientific visualization. Students will learn to programmatically process and analyze data with Python libraries widely used in statistics, engineering, science and finance. We will cover the design of effective visualizations. Students will learn to build data visualizations directly using matplotlib (Python) and interactive web-based visualizations using JavaScript and D3. Project groups of students will each propose, design and build a visualization of a data set. The course requires students have programming experience such as CSc 102/103 or equivalent.
The goals of the course are for students to:
- Recognize the appropriate applications and value of visualizations
- Critically evaluate visualizations and suggest improvements and refinements
- Apply a structured design process to create effective visualizations
- Use programmatic tools to scrape, clean, and process data
- Use principles of human perception and cognition in visualization design
- Use visualization tools to explore data
- Create web-based interactive visualizations
- Use statistical tools to aid visualization of data
This course will mostly cover web applications and web services, as well as other topics in building web-based internet applications. It begins with a lecture component.
In lectures I will discuss:
- History of web technologies
- Python client
- Test frameworks
- Web services (REST)
- RESTful APIs such as those of Twitter, Google, and Amazon Web Services
- Web frameworks for building web services, e.g. Flask, Django
- HTML5/JavaScript/CSS for consuming data
These topics will be covered in much less depth than in the undergraduate course CSc 473. You will be expected to pick up what you need more rapidly.
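To make the web-service topics listed above a little more concrete, here is a minimal sketch of a JSON endpoint and an in-process test call. It deliberately uses the standard library's WSGI interface rather than Flask or Django so that it runs without extra installs; the `COURSES` data, the `/courses/<id>` route, and the `call` helper are all hypothetical.

```python
import json
from wsgiref.util import setup_testing_defaults

# Hypothetical in-memory "database" standing in for a real data store.
COURSES = {
    "473": {"title": "Web Development"},
    "i1020": {"title": "Introduction to Data Science"},
}


def app(environ, start_response):
    """Tiny WSGI application exposing GET /courses/<id> as JSON."""
    path = environ.get("PATH_INFO", "")
    method = environ.get("REQUEST_METHOD", "GET")
    parts = [p for p in path.split("/") if p]

    if method != "GET":
        status, payload = "405 Method Not Allowed", {"error": "GET only"}
    elif len(parts) == 2 and parts[0] == "courses" and parts[1] in COURSES:
        status, payload = "200 OK", COURSES[parts[1]]
    else:
        status, payload = "404 Not Found", {"error": "not found"}

    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def call(path):
    """Invoke the app in-process, the way a framework test client would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal GET request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


print(call("/courses/473"))
print(call("/courses/999"))
```

Calling the application directly, without opening a socket, is also how many test frameworks exercise a web service; a framework such as Flask would replace the manual path parsing with declared routes.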
While the course will start off with some lectures and exercises presenting some foundations, it will then incorporate a seminar-like component where students will be expected to present focused expositions on the latest developments in web technologies. These topics will be negotiated with the student. In the past, topics have included NoSQL databases such as MongoDB, Redis, and Neo4j, JavaScript front-end frameworks like AngularJS and React, APIs such as Twilio and SoundCloud, and map services such as CartoDB. Groups of 3-4 students will build non-trivial web service projects with or without front ends. For project technology, see the CSc 473 course; however, projects in this course often use more bleeding-edge technology instead.
This course teaches both the theory and practice of software engineering and object oriented design. We discuss the classical waterfall model and its variants such as iterative waterfall, spiral and the rational unified process.
This is contrasted with Agile methodology, and we focus on Scrum and Extreme Programming techniques as representative of these ideas. So that students understand the software engineering process concretely, they will design and implement a software development project in Python using agile and iterative methodology, applying best practices. These best practices include the use of distributed version control (git, hg), software testing including TDD and BDD, agile-style user stories and user profiles, and issue tracking. Classical UML diagramming and software estimation will also be presented.
Why do I need to program?
It is possible to manage programmers and understand software engineering best practices without being an elite programmer. It is very difficult to do so without some hands-on experience. This hands-on experience will be provided by using a Scrum process to build a basic programming project with the Python language.
Python is one of the easiest programming languages to learn and runs on nearly all platforms. It has become the most popular language nationwide for intro to programming courses and is frequently taught in middle and high schools. Moreover, there is extensive online material for learning and application. The language is multi-paradigm, supporting imperative, procedural, and object-oriented programming, and it is even possible to program functionally in Python, although not as naturally. Also, with Python, there are well-established and popular libraries and scripts for unit, integration, and system testing, documentation, profiling, and linting, as well as an agreed best-practice coding style.