
BigCyber – 2020

This workshop was held in conjunction with the 2020 IEEE International Conference on Big Data (IEEE Big Data 2020), December 2020.

Keynote: Dr. Bhavani Thuraisingham, University of Texas at Dallas


The collection, storage, manipulation, analysis, and retention of massive amounts of data have resulted in serious security and privacy considerations. Various regulations are being proposed to handle big data so that the privacy of individuals is not violated. For example, even if personally identifiable information is removed from the data, an individual can still be identified when that data is combined with other data. While collecting massive amounts of data causes security and privacy concerns, the use of big data analytics in cybersecurity is exploding. For example, an organization can outsource activities such as identity management, intrusion detection, and malware analysis to the cloud. The question is: how can developments in data science techniques be used to solve security and safety problems? Furthermore, how can we ensure that such techniques are secure and can adapt to adversarial attacks? How can we handle the privacy implications? There have been many developments in answering such questions in recent years.

One of the critical applications of data science over the past year has been detecting and preventing the spread of COVID-19. The world has seen pandemics, terrorism, hurricanes, and other natural and man-made disasters. Each time such an event occurs, we discuss technologies that can solve the problem and their impact on our privacy and civil liberties. Such discussions occurred after the 9/11 terrorist attacks and are happening now during the COVID-19 pandemic, the worst human crisis we have faced in a century. Data science is showing a lot of promise not only in detecting and predicting the spread of COVID-19 but also in developing therapeutics and vaccines. We need to develop novel data science techniques for this purpose and at the same time ensure that the techniques are not attacked. However, there are also serious concerns with respect to data privacy that need to be addressed. Finally, how can one balance conflicting requirements such as safety vs. civil liberties?

First, this presentation will provide an overview of the security and privacy considerations for data science. Second, it will describe the application of data science, including stream data analytics, to cybersecurity applications such as malware analysis. Third, it will discuss trends in areas such as adversarial machine learning that take the attacker’s behavior into consideration. Fourth, it will focus on the privacy threats arising from the collection of massive amounts of data and on potential solutions. Finally, it will address the applications of data science to the COVID-19 pandemic and the tradeoffs between the safety of humans and the protection of their civil liberties.


Dr. Bhavani Thuraisingham is the Founders Chair Professor of Computer Science and the Executive Director of the Cyber Security Research and Education Institute at the University of Texas at Dallas (UTD). She is also a visiting Senior Research Fellow at King's College, University of London, and an elected Fellow of the ACM, IEEE, the AAAS, the NAI, and the BCS. For the past 35 years, her research has focused on integrating cybersecurity and artificial intelligence/data science (earlier framed as computer security and data management/mining). She served as a Cyber Security Policy Fellow at the New America Foundation in 2017-2018. She has received several prestigious awards, including the IEEE CS 1997 Technical Achievement Award, the ACM SIGSAC 2010 Outstanding Contributions Award, the IEEE Comsoc Communications and Information Security 2019 Technical Recognition Award, the IEEE CS Services Computing 2017 Research Innovation Award, the ACM CODASPY 2017 Lasting Research Award, the IEEE ISI 2010 Research Leadership Award, the 2017 Dallas Business Journal Women in Technology Award, and the ACM SACMAT 10 Year Test of Time Awards for 2018 and 2019 (for papers published in 2008 and 2009). She co-chaired the Women in Cyber Security Conference (WiCyS) in 2016, delivered the featured address at the 2018 Women in Data Science (WiDS) conference at Stanford University, and serves as the Co-Director of both the Women in Cyber Security and Women in Data Science Centers at UTD. Her 40-year career spans industry (Honeywell), a federal research laboratory (MITRE), the US government (NSF), and US academia. Her work has resulted in 130+ journal articles, 300+ conference papers, 160+ keynote and featured addresses, six US patents, and fifteen books, as well as technology transfer of the research to commercial products and operational systems.
She has also given featured addresses on data mining for counter-terrorism at the United Nations in New York and at the White House Office of Science and Technology Policy in Washington, DC. She received her Ph.D. from the University of Wales, Swansea, UK, and the prestigious earned higher doctorate (D.Eng.) from the University of Bristol, UK.

Workshop description

Security analysts need to process high-velocity and veracious data to detect cybersecurity events, such as attacks and data theft, early, ideally before an exploit occurs ("left of" the exploit). The problem is challenging given the constantly evolving threat landscape. Even with advanced monitoring, sophisticated persistent attackers can spend as many as 146 days in a system before being detected. Existing systems lack a unified organizational view, which causes information flooding and overwhelms security analysts with false alarms. We need techniques that reduce an analyst’s cognitive load.

Big data that crosses the organizational boundary, even in mid-sized environments, needs to be mined, examined, and analyzed to create ‘Analyst Augmentation Systems’ that will aid security analysts in their day-to-day operations.

This workshop aims to bring together researchers from cybersecurity and big data to help further homeland security’s missions of anticipation, interdiction, prevention, preparedness, and response. We invite submissions in areas including, but not limited to, knowledge extraction from cybersecurity intelligence big datasets, fast analysis of security datasets for relevant information, and the use of this knowledge for cybersecurity activities such as early attack detection, mitigation, remediation, and forensics.