Carlos Paradis
Data Scientist
I am a data scientist employed by KBR as a federal contractor for NASA. I hold an MS and a PhD in Computer Science from the University of Hawaii at Manoa, and an MS in Software Engineering from the Stevens Institute of Technology. I am also an honor member of IEEE-HKN and ACM-UPE through the same institutions, a lifetime member of ACM, and a recipient of the Science Without Borders scholarship.
- Location
- 94035, Mountain View, California, USA
- cvas@hawaii.edu
- Website
- http://carlosparadis.com
- GitHub
- GitHub: carlosparadis
- DBLP
- DBLP
- NTRS
- NTRS
- ORCID
- ORCID: 0000-0002-3062-7547
Experience
– present
Applied Researcher in Natural Language Processing - Federal Contractor at KBR - NASA Ames Research Center
Applying natural language processing (NLP) technologies to streamline the development and certification process for NASA missions
Highlights
- Develop an NLP capability to parse mission documents using machine learning techniques
- Look for potential problems (such as contradictions or potential safety issues)
- Evaluate the results with the help of mission developers
- Publicly released and published work is listed at: https://ntrs.nasa.gov/search?author=Carlos%20Paradis
–
Graduate Research at University of Hawaiʻi at Manoa - Shidler School of Business
Create a system to analyze the evolution of software vulnerabilities by mining software repository ecosystems
Highlights
- Identify prospective methods to detect poor software development processes in collaboration, coordination, and communication
- Identify prospective methods to detect poor architectural practices
- Create a method to construct timelines of software vulnerabilities
- Perform a case study on OpenSSL, in particular Heartbleed, applying the identified methods
- Create an R package implementing the final methodology, with reproducible results
- Docs/Demo: http://itm0.shidler.hawaii.edu/kaiaulu
- Conference Article: https://doi.org/10.1007/978-3-031-15116-3_6
- Pre-Print: https://arxiv.org/abs/2304.14570
–
Graduate Research at University of Hawaiʻi at Manoa - Shidler School of Business
Identify novel safety incidents based on textual narratives for NASA’s Aviation Safety Reporting System (ASRS)
Highlights
- Definition and analysis of the performance, stability and random effects of unsupervised text clustering algorithms
- Definition of a protocol and design of survey heuristics to assess the stability of machine learning results
- Identify published methodologies for interpretation of text clustering results
- Define an IRB-compliant survey protocol for evaluation of different text clustering interpretation methodologies
- Reuse and improvement of legacy code, and creation of data pipelines to supply data visualization tools
- Demos: 1) http://www2.hawaii.edu/~cvas/topicflow/ ; 2) http://www2.hawaii.edu/~cvas/termite/
- Create an R package implementing the final methodology, with reproducible results
- Docs/Demo: http://itm0.shidler.hawaii.edu/kaona
- Conference Article: https://doi.org/10.2514/6.2021-1981
–
Intern/Data Analyst at Staffing Solutions at Kaiser Permanente
Highlights
- Designed and enhanced databases for predictive and prescriptive analytics
- Designed and evaluated the effectiveness of statistical, probabilistic, and machine learning models
- Elicited requirements, goals, and priorities from clinical and technical subject matter experts
- Participated in the development of potential solutions, including outcome and process measures, and technical specifications
- Translated needs, issues, and ideas into improved processes for patient care
- Formulated specific implementation plans and evaluated the effectiveness of actions/programs implemented
- Communicated results and recommendations to project sponsors, clients, and various senior-level audiences
–
Graduate Research at University of Hawaiʻi at Manoa & University of Maryland
Identified CVE-related software vulnerability discourse on online mailing lists using topic modelling
Highlights
- Created a Python crawler to parse HTML mailing lists
- Created a taxonomy to classify different pre-processing strategies for the raw corpus
- Code-reviewed student work via GitHub pull requests
- Performed topic modelling with LDA in R
- Conference Article: https://doi.org/10.1109/ICMLA.2018.00121
–
Graduate Research / Data Engineer at University of Hawaiʻi at Manoa - School of Architecture
Coded data pipelines in Python for sensor data processing and database storage, and thermal comfort analyses in R. Performed Unix server administration duties to support architecture researchers.
Highlights
- Simplified knowledge management and workflow between supervisor and students from architecture, engineering, and computer science
- Proposed and code-reviewed Lonoa (github.com/erdl/lonoa), a new Python plug-in architecture for continuous sensor data collection in the lab
- Implemented and explained model constraints for ASHRAE-55 PMV/PPD and Adaptive thermal comfort models (github.com/erdl/thermal_comfort)
- Provided System Administrator support for the lab (SSH Access, Unix Package Management, and Security)
- Proposed, designed, and code-reviewed two Raspberry Pi Environmental Projects (awarded Intel NCS Funding)
- Designed database schema and code-reviewed survey station code for environmental experiments (github.com/erdl/survey_admin)
- Mentored undergraduate and master's student projects
- Conference Article: http://www.arcc-arch.org/past-conferences/ (ARCC 2019 Conference Proceedings)
–
Intern II at SGT at NASA Ames Research Center
Created a model of an attack surface in military drone communication by inserting a ‘liar’ script at the communication boundary using OpenUxAS, developed by the Air Force Research Laboratory.
Highlights
- Conference Article: doi.org/10.2514/6.2019-0770
–
Graduate Research at University of Hawaiʻi at Manoa & Siemens
Identified associations between software companies’ organizational structure and their software products.
Highlights
- Created scripts to calculate metrics from version control systems and issue trackers
- Defined static metrics to evaluate graph motifs on code collaboration social networks and discussion social networks
- Conference Poster: icse2018.org/event/icse-2018-posters-poster-conway-law-or-not-
- Journal Article: To appear in IEEE Transactions on Software Engineering
–
Graduate Research at University of Hawaiʻi at Manoa & HECO
Performed data mining/analytics of weather-related data for renewable (solar) energy management in Hawaiʻi. (Master's thesis)
Highlights
- Created a pipeline to collect data from various sensors across Hawaiʻi
- Scraped associated metadata from websites and created several visualizations to identify missing or biased values across multiple years
- Used clustering methods to address time series granularity; probabilistic and linear models were also used to forecast solar irradiation at different sites
- Conference Article: doi.org/10.1109/ICMLA.2015.137
–
Graduate Research at University of Hawaiʻi at Manoa - Office of Public Health Studies
Analyzed cross-sectional surveys conducted in Ethiopia by DHS to identify relationships between sanitation and malnutrition.
Highlights
- Performed systematic review (database search, snowballing, and inverse snowballing)
- Surveyed, pre-processed, and cleansed the data, and performed correlation analysis on variables of interest
–
Lecturer at Universidade Federal da Bahia, Brazil
Lecturer for the Information Systems major
Highlights
- Databases Lab
- Data Structures
- Information Retrieval
- Material: I used a combination of lecture notes, LaTeX, “hand-made” TikZ forms, color, and other visual examples to illustrate abstract concepts for freshmen
–
Undergraduate Teaching Assistant at Universidade Federal da Bahia, Brazil
Teaching Assistant for the Computer Science major
Highlights
- Formal Languages and Automata
- Programming Logic for Computer Science
- Paradigms of Programming Languages
- Activities consisted of giving lectures under professor supervision, assisting in creating material and grading, and organizing guest coding dojos
–
Undergraduate Research at University of Hawaiʻi at Manoa & Drexel University
Identified patterns of effort in software development.
Highlights
- Gathered source code and issue tracker data (website, .xml, .csv, .xlsx) using Python and R, and stored it in PostgreSQL
- Produced summary statistics and correlation analysis reports
- Journal Article: dx.doi.org/10.1016/j.jss.2014.11.015
–
Undergraduate Research at Universidade Federal da Bahia - Formas
Identified patterns that would lead to STEM student retention at Universidade Federal da Bahia.
Highlights
- Gathered and processed data (PDF format transcripts)
- Created schema and data population pipeline of a PostgreSQL database
- Performed exploratory data analysis, association rule learning, and literature review.
- Conference Article: http://dx.doi.org/10.5753/cbie.sbie.2013.577
–
Undergraduate Research at Universidade Federal da Bahia
Tracked and analyzed programmers' behavior when using the Eclipse IDE to better understand the impact of conceptualization in god class detection.
Highlights
- Captured point-and-click logs using Eclipse plugins
- Collected and processed sequence data (.csv)
- Performed exploratory data analysis and process mining
- Conference Article: doi.org/10.1145/2460999.2461007
Volunteer
– present
Stakeholder/Mentor - ICS 496 at University of Hawaii at Manoa
I serve as a stakeholder/mentor for UH Manoa ICS 496 Capstone projects. The students contribute code and posters (https://github.com/sailuh/kaiaulu_cheatsheet/tree/main/poster) to the open source tool Kaiaulu, which was part of my PhD dissertation, also at UH Manoa, and has since been published in a number of scientific works. See the project URL for posters, or the tool's GitHub repository (https://github.com/sailuh/kaiaulu) for their contributions.
–
External Reviewer at Journal of Aerospace Information Systems (JAIS)
–
Shadow Program Committee Member at Mining Software Repositories (MSR)
–
Reviewer at IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
–
Program Committee Member at International Conference on Machine Learning and Applications (ICMLA)
–
Intel Nervana AI Academy Student Ambassador at Intel Nervana
–
Volunteer at Code for Hawaiʻi
I facilitate open-budget access for citizens of the State of Hawaiʻi, helping them make more informed decisions when choosing their state representatives by making candidates' financial ties more accessible. Campaign contribution records scattered across different official State of Hawaiʻi pages are cleaned, linked by text, and brought together for navigation on a single website.
–
External Reviewer at EDBT/ICDT 2016 Joint Conference
–
Volunteer at Open Knowledge Brazil
I volunteered as a data analyst for the Open Spending team of the Open Knowledge Brazil chapter, which was a finalist in the Google Impact Challenge in Brazil. The project's goal was to analyze the fiscal year budgets of the city and state of São Paulo, as well as the federal government, serving as a case study for applying the same methods and tools to other cities. Mainly, the project sought to read between the government’s budget lines and understand where tax money was spent. My role in the team was to help write data stories, find and explore what data was available on existing outdated websites, and help make it available through our website and/or API. In parallel, I explored the open spending of Salvador, another city in Brazil and my hometown, looking at the availability of both city and state budget data.
–
Volunteer at Science without Borders Network 'Rede CsF'
Rede CsF is a non-profit organization based in Brazil, created by students who were awarded the Science Without Borders scholarship by the Brazilian government. The non-profit serves as a return on that investment by creating projects in the country to improve science, technology, innovation, and education. Within the network, I collaborate on the Open Data Awareness Project and the Intranet, the latter to coordinate the activities of over 40 contributors.
–
Chair of Student Chapter at Association for Computing Machinery
I founded the first and only ACM student chapter in Brazil. Main chapter activities were initially focused on raising awareness and supporting local computer science events. The chapter was featured in XRDS, the student magazine of ACM, in its first semester.
Education
–
PhD in Computer Science from University of Hawaiʻi, Honolulu, HI with GPA of 4.0
–
MS in Computer Science from University of Hawaiʻi, Honolulu, HI with GPA of 4.0
–
MS in Software Engineering from Stevens Institute of Technology, Hoboken, NJ with GPA of 3.92
–
BS in Computer Science from Universidade Federal da Bahia, Brazil with GPA of 9.5 (0-10)
Awards
ICS Achievement Scholarship from University of Hawaiʻi at Manoa
Featured Idea and Intel NCS Funding from Intel
IEEE HKN from IEEE Honor Society HKN Delta Omega
Golden Key Honors Member from Golden Key International Honour Society
Featured Student Brazilian Award from Brazilian Computer Society
Science Without Borders - Capes and LASPAU from Ciencia Sem Fronteiras
ACM UPE from ACM Honor Society UPE
Publications
A Socio-technical Perspective on Software Vulnerabilities: A Causal Analysis (To appear) by Journal of Systems and Software
Textual and Network Analysis of Title 14 CFR Part 107 Waivers (To appear) by 2024 Digital Avionics Systems Conference (DASC'24)
A Grounded Theory of UAS Reported Accidents by 2024 AIAA Aviation Forum
Analyzing the Tower of Babel with Kaiaulu by Journal of Systems and Software
Making Team Projects with Novices More Effective: An Experience Report by Hawaii International Conference on System Sciences 2024
Building the MSR Tool Kaiaulu: Design Principles and Experiences by Journal of Systems and Software
Analyzing the Relationship between Community and Design Smells in Open-Source Software Projects: An Empirical Study by Proceedings of the 15th International Conference on Cooperative and Human Aspects of Software Engineering
Design Choices in Building an MSR Tool: The Case of Kaiaulu by Companion Proceedings of the 15th European Conference on Software Architecture
Assessing the Use of UAS-Related Terms in ASRS Using Seed Topic Modeling by 2023 AIAA SciTech Forum
Visualizing Corridors in Terminal Airspace Using Trajectory Clustering by 2022 Digital Avionics Systems Conference (DASC'22)
Identifying Emerging Safety Threats Through Topic Modeling in the Aviation Safety Reporting System: A Covid-19 Study by 2021 Digital Avionics Systems Conference (DASC'21)
A Survey Protocol to Assess Meaningfulness and Usefulness of Automated Topic Finding in the NASA Aviation Safety Reporting System by 2021 AIAA Aviation Forum
Augmenting Topic Finding in the NASA Aviation Safety Reporting System using Topic Modeling by 2021 AIAA SciTech Forum
In Search of Socio-Technical Congruence: A Large-Scale Longitudinal Study by IEEE Transactions on Software Engineering
Measured: Student Learning Through Monitoring Existing Buildings’ Energy Use And Occupant Comfort by Architectural Research Centers Consortium (ARCC)
Towards Explaining Security Defects in Complex Autonomous Aerospace Systems by AIAA Scitech 2019 Forum
Indexing Text Related to Software Vulnerabilities in Noisy Communities Through Topic Modelling by 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA 2018)
Conway: Law or not? by 40th International Conference on Software Engineering (ICSE 2018)
Probabilistic Models for One-Day Ahead Solar Irradiance Forecasting in Renewable Energy Applications by International Conference on Machine Learning and Applications, Special Track on Machine Learning on Energy Applications (ICMLA 2015)
Manufacturing execution systems: A vision for managing software development by Journal of Systems and Software, Volume 101, Pages 59-68, ISSN 0164-1212
Mining Retention Rules from Student Transcripts: A Case Study of the Information Systems programme at a Federal University by Anais do Simpósio Brasileiro de Informática na Educação, v. 1, p. 1, 2013
An exploratory study to investigate the impact of conceptualization in god class detection by Proceedings of the 17th International Conference on Evaluation and Assessment in Software Engineering - EASE ’13
Teaching Software Engineering Fundamentals in an Introductory Computer Programming Course by Fórum de Educação em Engenharia de Software (FEES), 2011, São Paulo
Skills
- Languages and Frameworks
Interests
- mining software repositories
- static code analysis
- social network analysis
- text mining