Applied Data Scientist

1/25/2026

The applied data scientist will develop and deploy batch data pipelines and support real-time NLP applications. They will also manage data pipelines and collaborate with various teams to enhance data product development.

Working Hours

40 hours/week

Company Size

51-200 employees

Language

English

Visa Sponsorship

About The Company

Start.io is a sell-side omnichannel advertising platform powered by real-time mobile audiences. We deliver hundreds of millions of ads per day across more than 500,000 active apps. Our platform uses artificial intelligence to deliver more efficient, effective and precise digital advertising campaigns. Our direct integration with thousands of mobile publishers gives us access to more than 50 billion first-party data signals per day across the globe. Marketers use these anonymized signals to understand and predict consumer behavior, identify new opportunities and fuel business growth.

About the Role

Start.io is a mobile marketing and audience platform. Start.io (formerly StartApp) empowers the mobile app ecosystem and simplifies mobile marketing, audience building, and mobile monetization. Start.io's direct integration with over 500,000 monthly active mobile apps provides access to unprecedented global first-party data, which can be leveraged to understand and predict behaviors, identify new opportunities, and fuel growth.

Our Data Science team is looking for an enthusiastic junior applied data scientist to join and handle batch processing and predictions for tens of billions of events from billions of users.

You’ll have the opportunity to make a daily impact on the development of our data products in a collaborative culture.

Responsibilities:

This position will combine data engineering and data science tasks.

Develop and deploy batch data pipelines processing billions of data signals.

Support real-time NLP applications.

Design data flows and develop utilities that allow team members to work more efficiently.

Manage data pipelines from and to all of our data sources, such as Vertica and AWS S3.

Develop CI/CD pipelines.

Build and maintain classical ML models as well as deep learning models for massive amounts of events.

Work closely with DS, DA, DevOps, ETL engineers, and product managers.

Build production-grade LLM and RAG systems for data-driven classification and prediction.

Requirements

1-2 years as a data scientist – A must.

B.Sc. in engineering, computer science, or other quantitative fields – A must.

Practical experience with Python and SQL – A must.

Understanding of ML and DL concepts.

Passion for data and DS workflows, with the ability to work both independently and as a team player.

Nice to have:

Practical experience with container-based development (e.g., Docker).

Experience in the design and development of scalable big data solutions.

Experience with the AWS platform – S3, SQS, etc.

Experience with PySpark – A big advantage!

Key Skills

Python, SQL, Machine Learning, Deep Learning, Data Engineering, NLP, AWS, Data Pipelines, CI/CD, Big Data, PySpark, Data Science, Data Products, Batch Processing, Data Flows, Team Collaboration