Final Examination Schedule


Winter 2018

Friday, January 19

Yun-Ming Shih

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
10:00 A.M.; UW1 370
MASS HDFS: Multi-Agent Spatial Simulation Hadoop Distributed File System

Parallel computing has been widely used for processing big data, but most systems handle only simple structured data types and do not support complex structured data. Because many scientific datasets, such as climatological data, are highly structured, this requirement must be addressed. UWCA is a web-serviced climate analysis application that uses Multi-Agent Spatial Simulation (MASS), a parallelization library created by the Distributed Systems Laboratory (DSLab) at the Computing & Software Systems Division, University of Washington Bothell, for its computations. Without a proper file-handling system, the master node becomes a bottleneck, causing slow performance. This inspired DSLab to develop the MASS Parallel I/O layer for parallel file reading.

Similar frameworks have been developed to handle structured data in parallel. ROMIO gives precise control over structured data, but its many features make the system complicated. SciHadoop offers a simple computing model but requires converting scientific data to text for processing. Our proposed system, MASS HDFS, is a MASS Parallel I/O layer with additional write functionality that uses HDFS for file distribution. It aims to handle structured data while remaining simple to use. MASS HDFS can read data into distributed arrays without introducing a single bottleneck or requiring data conversion.

The performance evaluation shows that for a 200MB file with up to eight computing nodes, MASS HDFS spends 0.6 seconds on file open and 0.09 seconds on file read.
Opening a 50MB file using 12.5 million Place elements takes 25 minutes; using 50 million Place elements takes four times longer. Reading data with 50 million Place elements can be done within 2.5 minutes, and eight times faster with 12.5 million Place elements. Our project makes parallel I/O possible and demonstrates the potential of processing data on a per-data-item scale using four computing nodes.

Wednesday, February 28

Clayton Johnson

Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Cyber Security Engineering
1:00 P.M.; DISC 464
Software Defined Networking: Modifying the OpenFlow Protocol for Mutual Authentication

The innovation of hardware virtualization has become the foundation for modern data centers worldwide as cloud technology continues to grow in popularity. However, this virtual revolution has yet to extend to the network infrastructure. Traditional networks require dedicated devices, each with its own control logic managed independently. These legacy setups are reaching a critical point where they can no longer keep pace with new services such as Big Data and IoT. In response to this virtual era, Software Defined Networking (SDN) was introduced as a new network architecture that provides improved scalability, reduced maintenance overhead, and higher-level programmable functionality, achieved by decoupling the network control logic from the forwarding devices (switches). Over the last several years this architecture has been explored, and a promising protocol known as OpenFlow has been introduced for communicating control logic. The OpenFlow protocol has been widely adopted as the foundation for many SDN solutions; however, it relies heavily on SSL/TLS for security, and that protection is often made optional. In this project, we demonstrate that the OpenFlow protocol should provide enhanced security features beyond an optional encrypted channel if it is to serve as the backbone for future networked services. We propose modifications to the OpenFlow protocol such that it provides message integrity, mutual authentication, and confirmation of policy state changes. Through experiments, we show that our solution enhances the security of the SDN architecture and can be easily adopted by the Open Networking Foundation.

Monday, March 5

Jeremy Woods

Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Computer Science & Software Engineering
3:00 P.M.; DISC 464
Machine Learning based Malware Detection for Mobile Devices

With the growing prominence of mobile devices and applications, attackers are increasingly targeting the mobile market. The Android operating system currently accounts for more than 80% of the smartphone software market. As a result, it is also the largest target for mobile malware attacks. Mobile malware detection is critical to protect mobile devices and applications from security attacks. Existing research shows that mobile malware detection is mostly performed on emulators and virtual machines. These approaches often overlook the resource limitations inherent in mobile platforms. In this project, we develop an application running within the Android base framework that uses machine learning to detect malware on mobile devices. Through this application, we gather specific features such as network statistics, battery usage statistics, and application permissions to train our machine learning models. Network statistics include the bytes and packets sent and received by an application. Battery usage statistics include the percentage of battery used as well as the total energy used by an application. Application permissions include both static and dynamic permissions used by the application. Unlike previous studies, we collect the data and train the model on a real device. In addition, the application is also useful for efficiently extracting data from real devices using a database. We found that, using the collected malware data, unseen malware from different malware families can be detected based on our selected features. Results show that malware attacks affecting resource consumption can be detected with an accuracy of 98.39%.

Tuesday, March 6

Nathaniel Grabaskas

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
10:00 A.M.; DISC 464
Automated Parallelization to Improve Usability and Efficiency of Distributed Neural Network Training

The recent success of Deep Neural Networks (DNNs) has triggered a race to build larger and larger DNNs; however, a known limitation is training speed. To solve this speed problem, distributed neural network training has become an increasingly large area of research. Usability, the complexity a machine learning engineer or data scientist faces when implementing distributed neural network training, is an aspect rarely considered, yet critical. There is strong evidence that growing complexity has a direct impact on development effort, maintainability, and fault proneness of software. We investigated whether automation can greatly reduce the implementation complexity of distributing neural network training across multiple devices without loss of computational efficiency compared to manual parallelization. Experiments were conducted using Convolutional Neural Networks (CNNs) and Multi-Layer Perceptron (MLP) networks to perform image classification on the CIFAR-10 and MNIST datasets. The hardware consisted of an embedded, four-node NVIDIA Jetson TX1 cluster. Our main contribution is reducing the implementation complexity of data-parallel neural network training by more than 90% and providing components, with near-zero implementation complexity, to easily parallelize all or only selected fully-connected neural layers.

Wednesday, March 7

Longfei Xi

Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
10:00 A.M.; DISC 464
A Cloud-based Architecture for Sharing Health- and Education-based Data: The Children’s Vision, Learning, and Technologies Project

The EYE (Educating Young Eyes) Center for Children's Vision Learning & Technology is a university-sponsored non-profit organization dedicated to the research, development, and education of technologies that help increase awareness of the importance of functional vision in children's learning. Since its establishment, various apps and games have been developed for children with near-vision issues, leading to the need for centralized data collection and sharing for further research. This project presents EYE Data Service, a cloud-based back-end system for sharing health and education research data within the EYE Center. It utilizes a microservice architecture and technologies such as RESTful APIs and JSON Web Tokens, making it a cloud-optimized online service supporting data collection, sharing, and management for apps and games on any platform. With considerations of availability, security, performance, and maintainability, it provides a complete set of features for user authentication, user management, and app data management as standard Web APIs. During testing, multiple EYE Center apps developed in Unity, PHP, and the .NET Framework were successfully connected to the EYE Data Service APIs. Testing shows that with proper user credentials, apps can submit their data or retrieve them for further analysis in a secure manner. It also shows that EYE Data Service can handle API access controls based on user roles.

Pankaj Maheshwari

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
3:00 P.M.; DISC 464
Automated Resource Provisioning Model for High Performance Computations in the Cloud

Flexible resource management is one of the key features of cloud-based solutions. Using this feature, users can increase or decrease the amount of computational resources in the cloud at any time, enabling applications to dynamically scale computing and storage resources and avoid over- and under-provisioning. In high performance computing (HPC), initiatives such as bag-of-tasks or key-value applications use a load balancer and a loosely-coupled set of virtual machine (VM) instances. In such a scenario, it is easy to add or remove virtual machines because the load balancer is in charge of distributing tasks among the active processes. However, iterative HPC applications are tightly-coupled and have difficulty taking advantage of elasticity because the number of processes is fixed throughout the application's run-time. In fact, simply adding new resources does not guarantee that the processes will use them. Moreover, removing a single process can compromise the entire execution of the application because each process plays a key role in its execution cycle. For tightly-coupled HPC applications built on MPI (Message Passing Interface), using the resource elasticity feature is a challenge, since the source code must be rewritten to address resource reorganization. To address these issues, we developed a PaaS-based Automated Resource Provisioning Model that acts as a resource manager and load balancer for iterative HPC applications running over cloud infrastructures. The model offers resource elasticity automatically, so the user does not need to configure any resource management policy. This mechanism supports fixed as well as dynamic thresholds; with dynamic thresholds, the threshold values are self-adjusted during application execution.
The framework provides asynchronous elasticity, i.e., the ability for applications to either increase or decrease their computing resources without blocking the current execution. The framework's viability is demonstrated through the execution of a CPU-bound, high-performance numerical wave-integration computation in an OpenNebula environment. Results demonstrate performance gains for the HPC application ranging from 28.4% to 59% across varying scenarios. Furthermore, tests show that dynamic thresholds yield more optimized resource consumption by the application.

Thursday, March 8

Hon Choi Wong

Acting Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; DISC 464
Workbench Dashboard – Managing Experiments using Data and Software Provenance

In e-Science, scientists use computer programs and data to run simulations. The process uses and generates many artifacts, including program code, executable software, and input and output files. The complexity of the relationships between artifacts grows over time, making those relationships a pain point to comprehend. Although some existing systems can relieve this pain by visualizing data provenance and managing the workflow of experiments, they do not show software provenance in the visualization and do not use the visualization to help analyze results. This project aims to create a software system, called “Workbench Dashboard”, that visualizes the artifacts and the relationships between them, based on data provenance and software provenance, to help scientists understand the relationships faster and analyze experiment results quickly. The usability evaluation shows that the visualization and the features of the Dashboard could help users easily search artifacts and their relationships. Participants in the evaluation had both positive and negative opinions about the application. Future work will focus on improving usability and on using the Dashboard to visualize artifacts from simulations created with different simulation software applications.

Thursday, March 15

Miles Dowe

Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
3:00 P.M.; UW1 370
Informing a Laughter Recognition Algorithm Through Qualitative Coding

Computer-assisted qualitative data analysis software (CAQDAS) applications support qualitative research by storing coded data samples in queryable databases. While some CAQDAS use machine learning to simplify a user's work, it appears no relationship has yet been demonstrated in which qualitative coding applications help influence machine learning models. In refactoring a laugh-finding algorithm to operate at scale, a web application called the Laughter Analysis System (LAS) was developed and hosted on Microsoft Azure. It consists of a web interface, a RESTful back-end affective computing service, and a relational database. Functioning as a sort of CAQDAS, the LAS enables a user to immediately find and qualitatively code laughter instances in videos from the BeamCoffer corpus, while also providing coded instances as samples for model re-training. New models can then be created, while earlier models can also be reused for refinement.
