Collin Martin

Data Science & Engineering Student @ UC3M

Building + learning all things data. Always looking to collaborate and learn new things across technical domains. 'If you want to get faster, run with people who are faster than you.'

About

I'm a Data Science and Data Engineering student at Universidad Carlos III de Madrid, a polytechnic university in Madrid, Spain. I expect to graduate in May 2027.

Before that, I spent years operating heavy machinery on wheat farms in Odessa, Washington, where I developed a strong work ethic through frequent 14-hour harvest days. That resilience carries into everything I do.

Now I build end-to-end data pipelines, machine learning models, and automation systems. During my recent internship at Agops360, I designed cloud-native pipelines and data architecture for agricultural data and IoT telemetry.

I'm looking to return to my roots and establish a career in Washington.

Data Engineering

Building end-to-end pipelines, cloud data synchronization, and structured data extraction at scale

Machine Learning

Statistical modeling, predictive analytics, classification, and unsupervised learning techniques

Agricultural Tech

IoT irrigation systems, farm management APIs, and pesticide regulatory data pipelines

Automation

OCR processing, serverless workflows, API integration, and cron-based data enrichment

Projects & Publications

AI-Driven Pesticide Data Pipeline

Created an end-to-end data pipeline in Python for the U.S. EPA's pesticide registration dataset (PPIS). Automated data extraction, cleaning, and synchronization with cloud databases, and used an LLM agent to extract restricted-entry interval (REI), pre-harvest interval (PHI), and personal protective equipment (PPE) data from PDF pesticide labels.

Farm Management API Integration

Developed event-driven data pipelines for an agricultural equipment manufacturer's API to fetch and manage field boundaries, crop types, work plans, and equipment activity. Implemented secure OAuth2 token handling and refresh logic.
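
For illustration, token refresh logic of this kind could look like the minimal Python sketch below. It is not the project's code: `TokenCache`, the `refresh` callable, the 60-second safety margin, and the payload keys (`access_token`, `expires_in`, `refresh_token`, the field names commonly used in OAuth2 token responses) are all assumptions, and the network call is left to the caller.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(
        self,
        refresh: Callable[[str], dict],  # hypothetical: exchanges a refresh token for a token payload
        refresh_token: str,
        margin: float = 60.0,            # assumed safety margin, in seconds
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._refresh = refresh
        self._refresh_token = refresh_token
        self._margin = margin
        self._clock = clock
        self._access_token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid access token, refreshing it if it is missing or near expiry."""
        now = self._clock()
        if self._access_token is None or now >= self._expires_at - self._margin:
            payload = self._refresh(self._refresh_token)
            # Some providers rotate the refresh token; otherwise keep the current one.
            self._refresh_token = payload.get("refresh_token", self._refresh_token)
            self._access_token = payload["access_token"]
            self._expires_at = now + payload["expires_in"]
        return self._access_token
```

Injecting `refresh` and `clock` keeps the cache testable without a live API. In a real pipeline the refresh token and client secret would come from a secrets store rather than source code.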

Irrigation Control System Integration

Built a data ingestion system connecting an IoT-enabled irrigation network using serverless edge functions, cloud functions, and pub/sub messaging. Captured telemetry data such as pressure, pivot angles, fault states, and communication health.

OCR Document Processing Pipeline

Designed a fully automated OCR pipeline leveraging cloud-based text extraction services within serverless functions to process PDF irrigation design documents. Parsed and transformed tabular data into CSVs for analytical use.
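
As a rough sketch of the last step, assuming the OCR service has already returned a table as a list of rows of cell strings (a hypothetical shape, not the actual service output), the rows can be normalized and written out as CSV with Python's standard library. The function name `rows_to_csv` is illustrative.

```python
import csv
import io
from typing import List


def rows_to_csv(rows: List[List[str]]) -> str:
    """Clean OCR'd table rows and serialize them as CSV text."""
    # Collapse stray whitespace that OCR often leaves inside cells.
    cleaned = [[" ".join(cell.split()) for cell in row] for row in rows]
    # Drop rows that are empty after cleaning.
    cleaned = [row for row in cleaned if any(row)]
    # Pad short rows so every record has the same number of columns.
    width = max((len(row) for row in cleaned), default=0)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in cleaned:
        writer.writerow(row + [""] * (width - len(row)))
    return buffer.getvalue()
```

Padding short rows keeps the output rectangular, which makes it easier to load into a dataframe or a database table later.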

E-Commerce Purchase Prediction

Led development of machine learning models achieving 89.7% accuracy in predicting e-commerce purchase behavior. Engineered and optimized five classification models (Gradient Boosting, Random Forest, Neural Networks, LDA/QDA) and implemented a risk analysis framework combining PageValues and purchase probabilities.

Skills & Certifications

LANGUAGES & FRAMEWORKS

Python TensorFlow FastAPI scikit-learn Claude Streamlit SQL Jupyter R

TOOLS & PLATFORMS

Docker Neo4j Figma Hugging Face

CERTIFICATIONS

Coming soon

Notes

Writing and notes on data engineering, ML, and whatever I'm learning.

Connect

Follow along on my journey, reach out for collaborations, or just say hi.