Carlos Paradis

Data Scientist

I am a data scientist employed by KBR as a federal contractor for NASA. I hold an MS and a PhD in Computer Science from the University of Hawaii at Manoa, and an MS in Software Engineering from the Stevens Institute of Technology. I am also an honor member of IEEE-HKN and ACM-UPE at the same institutions, a lifetime member of ACM, and a recipient of the Science Without Borders scholarship.

Location
94035, Mountain View, California, USA
Website
http://carlosparadis.com
GitHub
GitHub: carlosparadis
DBLP
NTRS
ORCID
ORCID: 0000-0002-3062-7547

Experience

present

Applied Researcher in Natural Language Processing - Federal Contractor at KBR - NASA Ames Research Center

Applying Natural Language Processing (NLP) technologies to streamline the development and certification process for NASA missions

Highlights

  • Develop an NLP capability to parse mission documents using machine learning techniques
  • Look for potential problems (such as contradictions or potential safety issues)
  • Evaluate the results with the help of mission developers
  • Publicly released and published work is listed at: https://ntrs.nasa.gov/search?author=Carlos%20Paradis

Graduate Research at University of Hawaiʻi at Manoa - Shidler School of Business

Create a system to analyze the evolution of software vulnerabilities by mining software repository ecosystems

Highlights

  • Identify prospective methods to detect poor software development processes in collaboration, coordination, and communication
  • Identify prospective methods to detect poor architectural practices
  • Create a method to construct timelines of software vulnerabilities
  • Perform a case study on OpenSSL applying the identified methods, in particular to Heartbleed
  • Create an R package implementing the final methodology, with reproducible results
  • Docs/Demo: http://itm0.shidler.hawaii.edu/kaiaulu
  • Conference Article: https://doi.org/10.1007/978-3-031-15116-3_6
  • Pre-Print: https://arxiv.org/abs/2304.14570

Graduate Research at University of Hawaiʻi at Manoa - Shidler School of Business

Identify novel safety incidents based on textual narratives for NASA’s Aviation Safety Reporting System (ASRS)

Highlights

  • Definition and analysis of the performance, stability and random effects of unsupervised text clustering algorithms
  • Definition of a protocol and design of survey heuristics to assess the stability of machine learning results
  • Identify published methodologies for interpretation of text clustering results
  • Define an IRB-compliant survey protocol for evaluation of different text clustering interpretation methodologies
  • Reuse and improvement of legacy code, and creation of data pipelines to supply data visualization tools
  • Demos: 1) http://www2.hawaii.edu/~cvas/topicflow/ ; 2) http://www2.hawaii.edu/~cvas/termite/
  • Create an R package implementing the final methodology, with reproducible results
  • Docs/Demo: http://itm0.shidler.hawaii.edu/kaona
  • Conference Article: https://doi.org/10.2514/6.2021-1981

Intern/Data Analyst at Staffing Solutions at Kaiser Permanente

Highlights

  • Designed and enhanced databases for predictive and prescriptive analytics
  • Designed and evaluated the effectiveness of statistical, probabilistic, and machine learning models
  • Elicited requirements, goals, and priorities from clinical and technical subject matter experts
  • Participated in the development of potential solutions, including outcome and process measures, and technical specifications
  • Translated needs, issues, and ideas into improved processes for patient care
  • Formulated specific implementation plans and evaluated the effectiveness of actions/programs implemented
  • Communicated results/recommendations to project sponsors, clients, and various senior level audiences

Graduate Research at University of Hawaiʻi at Manoa & University of Maryland

Identified CVE-related software vulnerability discourse on online mailing lists using topic modelling

Highlights

  • Created a Python crawler to parse HTML mailing lists
  • Created a taxonomy to classify different pre-processing strategies for the raw corpus
  • Code reviewed student work via GitHub pull requests
  • Performed topic modelling using R and LDA
  • Conference Article: https://doi.org/10.1109/ICMLA.2018.00121

Graduate Research / Data Engineer at University of Hawaiʻi at Manoa - School of Architecture

Coded data pipelines in Python for sensor data processing and database storage, and thermal comfort analyses in R. Performed Unix server administration duties to support architecture researchers.

Highlights

  • Simplified knowledge management and workflow between supervisor and students from architecture, engineering, and computer science
  • Proposed and code-reviewed Lonoa, a new Python plug-in architecture for continuous sensor data collection in the lab (github.com/erdl/lonoa)
  • Implemented and explained model constraints for ASHRAE-55 PMV/PPD and Adaptive thermal comfort models (github.com/erdl/thermal_comfort)
  • Provided System Administrator support for the lab (SSH Access, Unix Package Management, and Security)
  • Proposed, designed, and code-reviewed two Raspberry Pi Environmental Projects (awarded Intel NCS Funding)
  • Designed database schema and code-reviewed survey station code for environmental experiments (github.com/erdl/survey_admin)
  • Mentored undergraduate and master's student projects
  • Conference Article: http://www.arcc-arch.org/past-conferences/ (ARCC 2019 Conference Proceedings)

Intern II at SGT at NASA Ames Research Center

Created a model of an attack surface in military drone communication by inserting a ‘liar’ script at the communication boundary using OpenUxAS, developed by the Air Force Research Laboratory.

Highlights

  • Conference Article: doi.org/10.2514/6.2019-0770

Graduate Research at University of Hawaiʻi at Manoa & Siemens

Identified associations between software companies’ organizational structure and their software products.

Highlights

  • Created scripts to calculate metrics from version control systems and issue trackers
  • Defined static metrics to evaluate graph motifs on code collaboration social networks and discussion social networks
  • Conference Poster: icse2018.org/event/icse-2018-posters-poster-conway-law-or-not-
  • Journal Article: To appear in IEEE Transactions on Software Engineering

Graduate Research at University of Hawaiʻi at Manoa & HECO

Performed data mining/analytics of weather-related data for renewable (solar) energy management in Hawaiʻi. (Master's Thesis)

Highlights

  • Created a pipeline to collect data from various sensors across Hawaiʻi
  • Scraped associated metadata from websites and created several visualizations to identify missing or biased values across multiple years
  • Used clustering methods to address time series granularity, and probabilistic and linear models to forecast solar irradiation at different sites
  • Conference Article: doi.org/10.1109/ICMLA.2015.137

Graduate Research at University of Hawaiʻi at Manoa - Office of Public Health Studies

Analyzed cross-sectional surveys conducted in Ethiopia by DHS to identify relationships between sanitation and malnutrition.

Highlights

  • Performed systematic review (database search, snowballing, and inverse snowballing)
  • Surveyed, pre-processed, and cleansed the data, and performed correlation analysis of variables of interest

Lecturer at Universidade Federal da Bahia, Brazil

Lecturer for Information Systems Major

Highlights

  • Databases Lab
  • Data Structures
  • Information Retrieval
  • Material: I used a combination of lecture notes, LaTeX, “hand-made” TikZ forms, color, and other visual examples to illustrate abstract concepts for freshmen

Undergraduate Teaching Assistant at Universidade Federal da Bahia, Brazil

Teaching Assistant for Computer Science Major

Highlights

  • Formal Languages and Automata
  • Programming Logic for Computer Science
  • Paradigms of Programming Languages
  • Activities consisted of giving lectures under professor supervision, assisting in creating material and grading, and organizing guest coding dojos

Undergraduate Research at University of Hawaiʻi at Manoa & Drexel University

Identified patterns of effort in software development.

Highlights

  • Gathered source code and issue tracker data (website, .xml, .csv, .xlsx) using Python and R, and stored it in PostgreSQL
  • Produced summary statistics and correlation analysis reports
  • Journal Article: dx.doi.org/10.1016/j.jss.2014.11.015

Undergraduate Research at Universidade Federal da Bahia - Formas

Identified patterns that would lead to STEM student retention at Universidade Federal da Bahia.

Highlights

  • Gathered and processed data (PDF format transcripts)
  • Created schema and data population pipeline of a PostgreSQL database
  • Performed exploratory data analysis, association rule learning, and literature review
  • Conference Article: http://dx.doi.org/10.5753/cbie.sbie.2013.577

Undergraduate Research at Universidade Federal da Bahia

Tracked and analyzed programmers’ Eclipse IDE usage behavior to better understand the impact of conceptualization in god class detection.

Highlights

  • Captured point-and-click logs using Eclipse plugins
  • Collected and processed sequence data (.csv)
  • Performed exploratory data analysis and process mining
  • Conference Article: doi.org/10.1145/2460999.2461007

Volunteer

present

Stakeholder/Mentor - ICS 496 at University of Hawaii at Manoa

I serve as a stakeholder/mentor for UH Manoa ICS 496 Capstone projects. The students contribute code and posters (https://github.com/sailuh/kaiaulu_cheatsheet/tree/main/poster) to the open source tool Kaiaulu, which was part of my PhD dissertation also at UH Manoa, and has since been published in a number of scientific works. See project URL for posters, or the tool GitHub (https://github.com/sailuh/kaiaulu) for their contributions.

External Reviewer at Journal of Aerospace Information Systems (JAIS)

Shadow Program Committee Member at Mining Software Repositories (MSR)

Reviewer at IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)

Program Committee Member at International Conference on Machine Learning and Applications (ICMLA)

Intel Nervana AI Academy Student Ambassador at Intel Nervana

Volunteer at Code for Hawaiʻi

I facilitate open-budget access for citizens of the State of Hawaiʻi, allowing them to make more informed decisions when choosing their state representatives by making candidates’ financial ties more accessible. Records of candidates’ campaign contributions, distributed across different official State of Hawaiʻi pages, are cleaned and linked so they can be navigated on a single website.

External Reviewer at EDBT/ICDT 2016 Joint Conference

Volunteer at Open Knowledge Brazil

I volunteered as a data analyst for the Open Spending team of the Open Knowledge Brazil chapter, which was a finalist in the Google Impact Challenge in Brazil. The project’s goal was to analyze the fiscal year budgets of the city and state of São Paulo and of the federal government, serving as a case study for applying the same methods and tools to other cities. Mainly, the project sought to read between the lines of the government’s budget and understand where tax money was spent. My role in the team was to help write data stories, find and explore what data was available on existing outdated websites, and help make it available through our website and/or API. In parallel, I also explored the open spending of Salvador, another city in Brazil and my hometown, looking at the availability of both city and state budget data.

Volunteer at Science without Borders Network 'Rede CsF'

Rede CsF is a non-profit organization based in Brazil, created by students who were awarded the Science Without Borders scholarship by the Brazilian government. The non-profit serves as a return on that investment, creating projects in the country to improve science, technology, innovation, and education. Within the network, I collaborate on the Open Data Awareness Project and the Intranet, the latter to coordinate the activities of over 40 contributors.

Chair of Student Chapter at Association for Computing Machinery

I founded the first and only ACM student chapter in Brazil. The chapter’s main activities initially focused on raising awareness and supporting local computer science events. The chapter was featured in XRDS, the ACM student magazine, in its first semester.

Education

PhD in Computer Science from University of Hawaiʻi, Honolulu, HI with GPA of 4.0

MS in Computer Science from University of Hawaiʻi, Honolulu, HI with GPA of 4.0

MS in Software Engineering from Stevens Institute of Technology, Hoboken, NJ with GPA of 3.92

BS in Computer Science from Universidade Federal da Bahia, Brazil with GPA of 9.5 (0-10)

Awards

ICS Achievement Scholarship from University of Hawaiʻi at Manoa

Featured Idea and Intel NCS Funding from Intel

IEEE HKN from IEEE Honor Society HKN Delta Omega

Golden Key Honors Member from Golden Key International Honour Society

Featured Student Brazilian Award from Brazilian Computer Society

Science Without Borders - Capes and LASPAU from Ciencia Sem Fronteiras

ACM UPE from ACM Honor Society UPE

Publications

Textual and Network Analysis of Title 14 CFR Part 107 Waivers (To Appear) in 2024 Digital Avionics Systems Conference (DASC'24)

A Grounded Theory of UAS Reported Accidents in 2024 AIAA Aviation Forum

Analyzing the Tower of Babel with Kaiaulu in Journal of Systems and Software

Making Team Projects with Novices More Effective: An Experience Report in Hawaii International Conference on System Sciences 2024

Building the MSR Tool Kaiaulu: Design Principles and Experiences in Journal of Systems and Software

Analyzing the Relationship between Community and Design Smells in Open-Source Software Projects: An Empirical Study in Proceedings of the 15th International Conference on Cooperative and Human Aspects of Software Engineering

Design Choices in Building an MSR Tool: The Case of Kaiaulu in Companion Proceedings of the 15th European Conference on Software Architecture

Assessing the Use of UAS-Related Terms in ASRS Using Seed Topic Modeling in 2023 AIAA SciTech Forum

Visualizing Corridors in Terminal Airspace Using Trajectory Clustering in 2022 Digital Avionics Systems Conference (DASC'22)

Identifying Emerging Safety Threats Through Topic Modeling in the Aviation Safety Reporting System: A Covid-19 Study in 2021 Digital Avionics Systems Conference (DASC'21)

A Survey Protocol to Assess Meaningfulness and Usefulness of Automated Topic Finding in the NASA Aviation Safety Reporting System in 2021 AIAA Aviation Forum

Augmenting Topic Finding in the NASA Aviation Safety Reporting System using Topic Modeling in 2021 AIAA SciTech Forum

In Search of Socio-Technical Congruence: A Large-Scale Longitudinal Study in IEEE Transactions on Software Engineering

Measured: Student Learning Through Monitoring Existing Buildings’ Energy Use And Occupant Comfort in Architectural Research Centers Consortium (ARCC)

Towards Explaining Security Defects in Complex Autonomous Aerospace Systems in AIAA Scitech 2019 Forum

Indexing Text Related to Software Vulnerabilities in Noisy Communities Through Topic Modelling in 17th IEEE International Conference on Machine Learning and Applications (ICMLA 2018)

Conway: Law or not? in 40th International Conference on Software Engineering (ICSE 2018)

Probabilistic Models for One-Day Ahead Solar Irradiance Forecasting in Renewable Energy Applications in International Conference on Machine Learning and Applications, Special Track on Machine Learning on Energy Applications (ICMLA 2015)

Manufacturing execution systems: A vision for managing software development in Journal of Systems and Software, Volume 101, Pages 59-68, ISSN 0164-1212

Mining Retention Rules from Student Transcripts: A Case Study of the Information Systems programme at a Federal University in Anais do Simpósio Brasileiro de Informática na Educação, v. 1, p. 1, 2013

An exploratory study to investigate the impact of conceptualization in god class detection in Proceedings of the 17th International Conference on Evaluation and Assessment in Software Engineering - EASE ’13

Teaching Software Engineering Fundamentals in an Introductory Computer Programming Course in Fórum de Educação em Engenharia de Software (FEES), 2011, São Paulo

Skills

Languages and Frameworks

  • R
  • Python

Interests

  • mining software repositories
  • static code analysis
  • social network analysis
  • text mining