3 Months (Weekends Only)
Data Sciences (Big Data)
This comprehensive course provides in-depth knowledge of data science with a focus on big data technologies. Led by seasoned professional software engineers, the course gives participants the hands-on experience, theoretical knowledge, and practical skills required to become proficient data scientists capable of handling and analyzing large datasets. It includes practical exercises, data analysis projects, and a Live Project that runs throughout the training plan.
- Understanding data science and its applications
- Role of data scientists in the industry
- Setting up the data science environment
- Introduction to Python for data science
- Data collection and acquisition techniques
- Data cleaning and preprocessing
- Handling missing data and outliers
- Hands-on: Data cleaning and preprocessing with Python
- EDA techniques and visualization
- Statistical analysis and summary statistics
- Data visualization libraries (e.g., Matplotlib, Seaborn)
- Hands-on: Data visualization and EDA with Python
- Introduction to machine learning concepts
- Supervised and unsupervised learning
- Model evaluation and validation
- Hands-on: Building machine learning models with Python
- Introduction to big data and its challenges
- Hadoop and the Hadoop Distributed File System (HDFS)
- Apache Spark for big data processing
- Hands-on: Setting up Hadoop and Spark clusters
- Spark RDDs and DataFrames
- Spark transformations and actions
- Spark SQL for data querying
- Hands-on: Data processing with Spark
- Spark MLlib for machine learning
- Building and evaluating machine learning models in Spark
- Scalable machine learning with Spark
- Hands-on: Machine learning with Spark
- Introduction to NoSQL databases (e.g., MongoDB)
- Data storage in distributed systems
- Data warehousing and data lakes
- Hands-on: Working with NoSQL databases
- Introduction to deep learning
- Neural network architectures
- Deep learning frameworks (e.g., TensorFlow, Keras)
- Hands-on: Deep learning with Python
- Stream processing and real-time analytics
- Apache Kafka for data streaming
- Real-time analytics with Spark Streaming
- Hands-on: Real-time data processing with Kafka and Spark
- Natural Language Processing (NLP)
- Image and text analysis
- Time series analysis and forecasting
- Hands-on: Advanced data science projects
- Project kick-off and problem definition
- Data acquisition and preprocessing for the Live Project
- Building machine learning models on large datasets
- Real-time data processing and analytics
- Project presentation and demonstration
- Graduation and certificate distribution
Assessment:
- Weekly practical exercises and projects
- Final project evaluation and presentation
Certification:
Upon successful completion of the course and the Live Project, participants will receive a certificate in “Data Sciences (Big Data)” from “Industry Professionals.”
This course outline offers a structured, hands-on approach to data science with a focus on big data technologies. Guided by experienced industry professionals, it emphasizes practical skills through exercises, data analysis projects, and a Live Project that runs throughout the three-month program. Students will gain both theoretical knowledge and practical experience in data science and big data analytics.
Mamoona Riaz
Lead, AI & ML Engineer
- Employers: Arbisoft, Datics AI and C-SALT
- Clients: edX
- Experience: More than five years
- Education: Master's degree (Data Sciences and AI)