Course Overview:
In this course, you will gain practical, foundation-level training that enables immediate and effective participation in big data and other analytics projects. You will learn ways of storing data that allow for efficient processing and analysis, and gain the skills to store, manage, process, and analyze massive amounts of unstructured data in an appropriate data lake. Data science can be defined as a blend of mathematics, business acumen, tools, algorithms, and machine learning techniques, all of which help uncover hidden insights or patterns in raw data that can inform major business decisions.
Course Objectives:
- Immediately participate as a data science team member
- Work with large data sets and generate insights
- Build predictive and classification models
- Manage a data analytics project through the entire lifecycle
Target Audience:
- Managers of teams of business intelligence, analytics, and big data professionals
- Current business and data analysts looking to add big data analytics to their skills
- Data and database professionals looking to exploit their analytic skills in a big data environment
- Recent college graduates and graduate students with academic experience in a related discipline looking to move into the world of data science and big data
- Individuals looking to take the Data Science Associate (EMCDSA) certification
Pre-requisites:
- Strong quantitative background with a solid understanding of basic statistics, as would be found in a statistics 101 level course
- Experience with a programming or scripting language such as Java, Perl, Python, or R. Many of the lab examples taught in the course use R (with an RStudio GUI), which is an open-source statistical tool and programming language.
- Experience with SQL
Course Duration:
- 5 Days (35 Hours)
Course Content:
Introduction to Big Data analytics
- Big Data and its characteristics
- Business value from Big Data
- Data scientist
Data Analytics Lifecycle
- Data analytics lifecycle overview
- Discovery phase
- Data preparation phase
- Model planning phase
- Model building phase
- Communicate results phase
- Operationalize phase
Basic data analytics methods using R
- Introduction to the R programming language
- Analyzing and exploring data
- Statistics for model building and evaluation
Advanced analytics theory and methods
- Introduction to advanced analytics theory and methods
- K-means clustering
- Association rules
- Linear regression
- Logistic regression
- Text analysis
- Naïve Bayes
- Decision trees
- Time series analysis
Advanced analytics—technology and tools
- Introduction to advanced analytics—technology and tools
- Hadoop ecosystem
- In-database analytics: SQL essentials
- Advanced SQL and MADlib
Putting it all together
- Preparing to operationalize
- Preparing project presentations
- Data visualization techniques
- Lab exercise on Big Data analytics
- Q & A
- Closing Remarks
Course Customization Options:
To request customized training for this course, please contact us.