Big Data Certificate Courses

The Big Data certificate program at the San José State University (SJSU) School of Information (iSchool) includes three courses (nine units). The courses are delivered fully online and offered in a set sequence that affords scheduling flexibility for busy professionals. Students who successfully complete the required courses with a 3.0 GPA receive a transcript and Certificate in Big Data from SJSU.


Course Prerequisites

Applicants with a bachelor's degree in Computer Science, Computer Engineering, Information Science, or a related area should have either taken courses in or had practical experience with database management systems and programming/programming languages (Unix/Linux, C++ or Java, Python, R).

Applicants with bachelor's degrees in other areas should have practical experience in software development, database management, algorithm design, and programming languages (Unix/Linux, C++ or Java, Python, R).

In addition, students are encouraged (but not required) to have knowledge of cloud computing and website/mobile application services.

Required Courses

SJSU iSchool Big Data certificate students complete the following three courses (nine units):

INFO 208 Big Data Technologies

Instructor: Glen Mules, Ph.D., Data Scientist, Adjunct Professor, and former Senior Instructor for IBM

Introduces the technological ecosystem of Big Data and Hadoop, as well as the goals and work of Data Science. View Sample Syllabus

Upon successful completion of this course, students will be able to achieve the following Course Learning Outcomes:

  • Be able to frame Big Data questions, formulate a strategy, identify applicable technologies and techniques, form a data team, and understand the ethical considerations and risks of Big Data.
  • Be able to choose, use, and optimize technologies for loading large-scale data from disparate sources into a Big Data store, for integrating these heterogeneous datasets, as well as for Big Data searching, querying, and analytical processing.
  • Be able to analyze, display, communicate, and interpret massive amounts of abstract data effectively and efficiently via visual representations.
  • Be able to choose, use, combine, and evaluate techniques and technologies appropriate for different Big Data mining tasks, including supervised and unsupervised learning, link analysis, and recommendation systems.

INFO 209 Web and Data Mining

Instructor: TBD

This course focuses on data mining techniques and their applications to web mining. These include techniques for user and web content classification and clustering, algorithms for web recommendation systems, and methods for web search and social network analysis, such as information retrieval and link analysis. Technical material is covered with a focus on Big Data and scalability, using available open-source data mining and Hadoop-based tools. Topics include data mining, web mining, classification, numeric prediction, association rules, sequential patterns, web crawling, retrieval and search engines, social network analysis, link analysis, ranking, web personalization and recommender systems, and Hadoop-based technologies, including MapReduce, Spark, and MLlib. View Sample Syllabus

Upon successful completion of this course, students will be able to achieve the following Course Learning Outcomes:

  • Apply the fundamental data mining concepts and techniques (regression, classification, association learning, collaborative filtering, and clustering).
  • Assess model quality for each task in terms of relevant error metrics and associated potential costs.
  • Apply the fundamental web mining concepts and techniques (search engine indexing, web content ranking, retrieval, recommender systems, and personalized web services).
  • Develop social network analyses using text mining and sentiment analysis workflows.
  • Perform graph analytics to model, store, retrieve, and analyze graph-structured data (centrality, prestige, PageRank, and HITS algorithms).
  • Demonstrate and evaluate very large-scale data processing using scalable machine learning tools (MapReduce and Spark).

INFO 246 Information Visualization

Large-scale data analysis and table/graph design; perceptual properties; multivariate visual representations; visualization design principles; storytelling with visualization; user interactions; graphs and networks visualization; hierarchies and trees visualization; time series visualization; social visualization; visual analytics. View Sample Syllabus*

*Note that this sample syllabus for INFO 246 is from our MLIS program and is offered only as a guide. The INFO 246 Information Visualization course offered in the Big Data Certificate Program is more technical in scope, and the INFO 246 Information Visualization class offered through our MLIS program cannot be used as part of the Big Data Certificate Program. For more information, see our Big Data Certificate FAQs.

Upon successful completion of this course, students will be able to achieve the following Course Learning Outcomes:

  • Describe the perceptual and cognitive principles of information visualization.
  • Use data analysis methods and visualization tools to manage and analyze large collections of abstract information.
  • Identify interaction and interface design issues in visualization.
  • Apply visualization techniques to specific domains of their own interests for knowledge discovery and retrieval.

Course Sequencing

Fall 2016
Big Data Technologies
Web and Data Mining

Spring 2017
Information Visualization
Big Data Technologies

Summer 2017
Web and Data Mining
Information Visualization

Fall 2017
Sequence starts again

Program Learning Outcomes

Upon successful completion of the Big Data certificate curriculum, students will be able to achieve the following Certificate Program Learning Outcomes (CPLO):

Certificate Program Learning Outcome | Description
CPLO 1 | Be able to frame Big Data questions, formulate a strategy, identify applicable technologies and techniques, form a data team, and understand the ethical considerations and risks of Big Data.
CPLO 2 | Be able to choose, use, and optimize technologies for loading large-scale data from disparate sources into a Big Data store, for integrating these heterogeneous datasets, as well as for Big Data searching, querying, and analytical processing.
CPLO 3 | Be able to analyze, display, communicate, and interpret massive amounts of abstract data effectively and efficiently via visual representations.
CPLO 4 | Be able to choose, use, combine, and evaluate techniques and technologies appropriate for different Big Data mining tasks, including supervised and unsupervised learning, link analysis, and recommendation systems.