Big Data course offered at Capitol College



You won’t go far in industry circles these days without hearing about the promise and challenge of Big Data. The term has become a must-know buzzword, used extensively in the media, in corporate presentations, and in academic conversations.

Retailers hope massive data sets will enable them to better track consumer demand; health professionals see an opportunity to improve care while cutting costs; and insurers see a potential tool for combating fraud. Those are only a fraction of the potential applications.

As popular as the term has become, however, its meaning remains loosely defined, says Dr. Mark Moss, who is teaching a course on Big Data this fall at Capitol College.

“It encompasses different things,” Moss says. “People sweep a lot under the Big Data rug. Everyone is talking about how we need to leverage Big Data, how we can use it, what it can do for us – but often the definition is vague.”

The course (CS-710, Big Data Warehousing and Analytic Systems) will be offered Thursday nights from 7:00 to 10:15 PM. According to Moss, it is intended for those who would like to gain a clear, high-level understanding of the main Big Data concepts and challenges.

Moss plans to focus on three steps: processing raw data gathered from a variety of sources and in numerous formats; analyzing the data to find notable data points and patterns; and presenting the results in a way that different groups of people can readily understand. These three steps make up what he terms the “Big Data pipeline.”

As in many courses offered at Capitol, students will come away with not only an overview of theory and concepts but also a chance to put them into practice.

“I feel that the best way to understand and internalize many of these concepts is by using a hands-on approach,” Moss says. “During the course, the class will be given one or more data sets, and will perform these processing, analysis and presentation steps on the given data using some open-source database, data mining and visualization tools.”

“We will provide time to learn the basics of the given tools during the course, so advanced, in-depth knowledge of the tools is not required; however, having some basic computing and mathematics knowledge will be helpful. For example, we will likely use databases to manage some of the raw data, so having a basic understanding of SQL will be helpful, even if not absolutely required,” Moss says.

According to William Butler, chair of the Information Assurance program at Capitol, the course is the newest addition to the college catalog.

“It addresses a critical need which is fast becoming a cross-cutting capability across vertical markets such as business, healthcare, and insurance,” he said.

“The National Science Foundation (NSF) has observed that researchers in a growing number of these businesses are generating extremely large and complicated data sets, commonly referred to as Big Data. A wealth of information may be found within these sets, with enormous potential to shed light on some of the toughest and most pressing challenges facing the nation,” Butler said. “To capitalize on this unprecedented opportunity -- to extract insights, discover new patterns and make new connections across disciplines -- we need better tools to access, store, search, visualize and analyze these data.”

Enrollment is expected to be high, and spaces are limited, Butler noted, urging students to “sign up while there are still seats!”