Data Engineer in Japan: REMOTE + NO JAPANESE REQUIRED

Let’s be clear.

This blog post is about Data Engineers in Japan.

What they do, how they do it, and how you can do it too.

Got it?

Good.

There’s a lot to cover, so let’s get started.

 

What is this role?

A fast-growing travel tech company is looking for a Senior Data Engineer to help build and improve the data platform behind its global hospitality products.

The company develops technology used by hotels to streamline operations, improve efficiency, and drive sales growth. Behind those products is a modern data platform that supports internal business intelligence, recommendation engines, AI chatbot products, and future AI agent use cases.

In this role, you will join a small but highly experienced ML/Data/AI team and work closely with two veteran Data Engineers, ML Engineers, product managers, and internal stakeholders.

This role is especially focused on:

  • Building and improving large-scale data pipelines

  • Modeling complex hospitality industry datasets

  • Improving the company’s data catalog and metadata

  • Helping humans and AI agents discover and query data more effectively

  • Supporting AI-driven products such as recommendation engines and chatbots

  • Improving data platform performance and cost efficiency

A major focus of the role is making the company’s data easier to discover, understand, and query — not only for humans, but also for AI agents that need to find the right data and construct accurate, high-performance queries.

 

What you will do

This role sits at the intersection of data engineering, data architecture, AI infrastructure, and hospitality technology.

You will help the company ingest and model new external data sources that provide a deeper view of the hospitality industry. These datasets can support everything from business intelligence to AI-powered product features, recommendation engines, and chatbot experiences.

Key areas of work include:

  • Ingesting new external data sources related to the hospitality industry

  • Designing data models for critical business and product entities

  • Improving the usability of the company’s data platform

  • Helping internal users find and query the right data

  • Supporting ML and AI teams with reliable data features and pipelines

  • Reducing unnecessary compute costs through better modeling and query design

This is not just a pipeline-building role. You will also help shape how the company organizes, explains, and exposes data across the business.
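To make the modeling side of this concrete, here is a toy-scale sketch of the kind of work involved. All table and column names are invented for illustration, and an in-memory SQLite database stands in for the warehouse (the real stack uses BigQuery). The idea is the classic one: split raw booking records into a narrow fact table plus a hotel dimension, so metric queries scan less data and wide descriptive columns are only read when joined.

```python
import sqlite3

# In-memory stand-in for a warehouse; the real stack uses BigQuery.
# Table and column names below are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: one row per hotel; wide descriptive columns live here.
    CREATE TABLE dim_hotel (
        hotel_id INTEGER PRIMARY KEY,
        name     TEXT,
        city     TEXT
    );
    -- Fact: one narrow row per booking, referencing the dimension.
    CREATE TABLE fact_booking (
        booking_id INTEGER PRIMARY KEY,
        hotel_id   INTEGER REFERENCES dim_hotel(hotel_id),
        nights     INTEGER,
        amount_jpy INTEGER
    );
""")
conn.executemany("INSERT INTO dim_hotel VALUES (?, ?, ?)",
                 [(1, "Sakura Inn", "Tokyo"), (2, "Harbor Stay", "Osaka")])
conn.executemany("INSERT INTO fact_booking VALUES (?, ?, ?, ?)",
                 [(10, 1, 2, 30000), (11, 1, 1, 15000), (12, 2, 3, 45000)])

# Revenue per city: the narrow fact table carries the metrics,
# the dimension supplies the grouping attribute.
rows = conn.execute("""
    SELECT h.city, SUM(b.amount_jpy)
    FROM fact_booking b JOIN dim_hotel h USING (hotel_id)
    GROUP BY h.city ORDER BY h.city
""").fetchall()
print(rows)  # [('Osaka', 45000), ('Tokyo', 45000)]
```

At warehouse scale the same separation (plus partitioning and clustering on the fact table) is one of the main levers for the query-cost reductions the role calls for.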

 

Key Responsibilities

As a Senior Data Engineer, you will design, develop, and maintain robust batch and streaming data pipelines using tools such as Apache Beam, Spark, and BigQuery.

Your main responsibilities will include:

  • Designing, developing, and maintaining batch and streaming data pipelines

  • Processing large-scale hospitality industry datasets

  • Improving data models for the company’s most important entities

  • Making sure data consumers can query accurate data efficiently

  • Optimizing data models and queries to control compute costs

  • Working with data producers to improve metadata and data discoverability

  • Supporting data catalog and platform initiatives

  • Helping humans and AI agents find, understand, and use the right data

  • Collaborating with ML Engineers on data features and pipelines

  • Supporting AI products, recommendation engines, and chatbot-related features

You will also play an important role in making the data platform more reliable, scalable, and useful for teams across the organization.

 

Tech Stack

The team works with a modern data engineering stack. You do not necessarily need hands-on experience with every tool, but you should have strong experience building scalable data systems and be comfortable learning new technologies.

The main technical stack includes:

  • Data warehouse: BigQuery

  • Processing: Apache Beam / Dataflow, Spark, DuckDB

  • Orchestration and transformation: dbt, Airflow, Prefect

  • Languages: Python, Scala, or Java

  • Cloud and storage: Google Cloud Storage (GCS) and AWS

  • Relevant AWS services: AWS Glue and related data services

Experience with BigQuery, Apache Beam, Spark, Dataflow, AWS Glue, dbt, Airflow, or Prefect would be especially relevant.

 

Required Experience

The ideal candidate will have at least 5 years of combined experience in data engineering, data architecture, or software engineering.

This role is especially suited for someone who has hands-on experience developing data pipelines and data models at a large scale. Experience in technology-driven environments such as major internet companies, e-commerce platforms, large retailers, or high-volume data organizations would be highly relevant.

Relevant backgrounds may include:

  • Data Engineering

  • Data Architecture

  • Backend Engineering

  • Software Engineering

  • Database Administration

  • BI Engineering

  • Data Science with production data system experience

A background in software engineering is a strong plus because the team values engineering best practices, maintainability, scalability, and clean system design.

The company is especially interested in candidates who can show:

  • Experience building production-grade data pipelines

  • Strong understanding of data modeling

  • Ability to work with large-scale datasets

  • Experience improving data quality, performance, and reliability

  • Ability to lead technical projects

  • Clear communication with technical and non-technical stakeholders

  • Interest in modern data platforms and AI-related infrastructure

 

Language Requirements

This role requires fluency in English.

The team includes engineers and product managers from diverse backgrounds, so you should be comfortable using English in technical discussions and cross-functional collaboration.

You should be able to:

  • Explain technical ideas clearly in English

  • Discuss architecture and data platform decisions

  • Collaborate with engineers, ML teams, and product managers

  • Communicate trade-offs around performance, cost, and reliability

  • Work effectively in an international team environment

Based on the current job description, Japanese is not listed as a requirement.

 

Nice-to-Have Experience

Experience with AI-related data infrastructure would be a strong advantage.

This includes experience or interest in:

  • RAG

  • Context-aware AI systems

  • Agent orchestration

  • Embeddings

  • Knowledge graphs

  • Vector databases

  • Data catalogs

  • Semantic layers

  • Metadata management

  • AI-ready data infrastructure

You do not need to be an ML Engineer, but you should be interested in how data platforms can support the next generation of AI-powered products.

Because part of the role involves making data more discoverable and usable by AI agents, experience thinking about metadata, structured data access, and query accuracy would be highly relevant.
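As a toy-scale sketch of why that matters (all names invented for illustration), a structured catalog entry with per-column descriptions is what lets a human or an AI agent find the right table by keyword instead of guessing:

```python
# Minimal in-memory data catalog: table name -> description + column notes.
# Table and column names are hypothetical, for illustration only.
CATALOG = {
    "fact_booking": {
        "description": "One row per confirmed hotel booking.",
        "columns": {"hotel_id": "FK to dim_hotel",
                    "nights": "Length of stay",
                    "amount_jpy": "Booking revenue in JPY"},
    },
    "dim_hotel": {
        "description": "One row per hotel property.",
        "columns": {"hotel_id": "Primary key", "city": "Hotel city"},
    },
}

def find_tables(keyword: str) -> list[str]:
    """Return tables whose description or column notes mention the keyword."""
    kw = keyword.lower()
    return [name for name, meta in CATALOG.items()
            if kw in meta["description"].lower()
            or any(kw in note.lower() for note in meta["columns"].values())]

print(find_tables("revenue"))  # ['fact_booking']
```

A real platform would back this with a proper catalog tool rather than a dict, but the principle is the same: rich, accurate metadata is what turns "find the revenue table" into a reliable lookup for both people and agents.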

 

Who This Role Is Good For

This role is a strong fit for someone who enjoys building data systems that are both technically solid and directly useful to the business.

You may be a good match if you:

  • Have built large-scale batch or streaming data pipelines

  • Care about data quality, modeling, reliability, and performance

  • Enjoy reducing complexity and making data easier for others to use

  • Want to work on AI-adjacent data infrastructure

  • Like working in a small, senior team where your decisions have visible impact

  • Prefer a fully remote work style

  • Are interested in travel tech, hospitality data, and global products

  • Want to work with experienced engineers on complex data platform problems

This is not just a maintenance role. You will be helping shape how the company’s data platform evolves as AI agents, recommendation systems, and chatbot products become more important to the business.

 

How to Stand Out as an Applicant

To stand out, you should show that you can think beyond pipeline implementation.

The strongest candidates will be able to explain how they have designed data systems that are reliable, scalable, cost-conscious, and easy for others to use.

In your resume and interviews, it would be helpful to highlight:

  • Large-scale data pipelines you have built or maintained

  • Data models you designed or improved

  • Query performance improvements

  • Cost optimization projects

  • Data catalog or metadata initiatives

  • Work with ML Engineers or AI-related teams

  • Experience with BigQuery, Spark, Beam, Dataflow, AWS Glue, dbt, Airflow, or Prefect

  • Examples of leading technical projects

  • Times you worked with business or product stakeholders

  • The scale of the systems you worked on, such as data volume, number of pipelines, or business impact

AI-related experience is also worth emphasizing, especially if you have worked with:

  • RAG

  • Embeddings

  • Vector databases

  • Knowledge graphs

  • AI agents

  • Recommendation systems

  • Chatbot data infrastructure

The key is to show not only what you built, but why it mattered and how it improved the platform, product, or business.

 

Career Path

This role can be a strong next step for an experienced Data Engineer who wants to move into more strategic data platform ownership.

From this role, possible career paths include:

  • Lead Data Engineer

  • Data Architect

  • Data Platform Lead

  • Analytics Engineering Lead

  • AI/Data Infrastructure Lead

  • Engineering Manager for Data or Platform teams

  • Technical Lead for AI infrastructure initiatives

Because the team is small and senior, you will likely have opportunities to influence architecture, improve platform standards, and shape how the organization uses data.

This could also be a good transition role for someone coming from:

  • Backend engineering into data platform engineering

  • DBA work into modern cloud data infrastructure

  • BI engineering into large-scale data architecture

  • Data science into production-grade data engineering

  • Data engineering into AI infrastructure

For someone who wants to move from implementation-focused engineering into broader technical leadership, this role offers a clear path.

 

Compensation and Work Style

The salary is up to ¥12M, depending on experience.

The role is fully remote, making it suitable for candidates who want flexibility while still working on complex, high-impact technical challenges.

Key details include:

  • Salary: Up to ¥12M

  • Work style: Fully remote

  • Team: 5-person ML/Data/AI unit

  • Environment: International and engineering-driven

  • Focus: Data platform, AI infrastructure, hospitality datasets, and product impact

You will be part of a diverse engineering and product environment, working with people who are building unique products for the hospitality and travel industry.

 

Interview Process

The interview process is structured and practical.

The expected process includes:

  1. Initial Screen
    A 30-minute conversation with the Hiring Manager. This is usually an opportunity to discuss your background, motivation, technical experience, and fit for the role.

  2. Practical Assessment
    A take-home assignment designed to help the team understand how you approach real data engineering problems.

  3. Technical Interview
    A deep-dive interview with the two current Data Engineers. This stage will likely explore your technical experience, system design thinking, data modeling ability, and approach to scalable data pipelines.

  4. Final Interview
    A final leadership review to assess overall fit, communication style, and alignment with the team’s goals.

To prepare, you should be ready to discuss:

  • Past data pipeline projects

  • Data modeling decisions

  • Big data architecture

  • Query optimization

  • Cost-performance trade-offs

  • Technical leadership examples

  • Collaboration with ML, product, or business teams

 

FAQ

Is this a pure data pipeline role?

No. Pipeline development is important, but the role also involves:

  • Data modeling

  • Platform improvements

  • Metadata and data catalog work

  • Cost optimization

  • AI-ready data infrastructure

  • Collaboration with ML and product teams

Do I need AI or machine learning experience?

You do not need to be an ML Engineer, but AI-related experience is a plus.

The role supports:

  • Recommendation engines

  • Chatbot products

  • AI agents

  • Data discovery for AI systems

  • Future AI-driven product features

Is Japanese required?

Based on the current job description, the key language requirement is fluency in English.

Japanese is not listed as a requirement.

Is this role fully remote?

Yes. The role is fully remote.

This makes it a strong option for candidates who want flexibility while still working on complex data platform and AI infrastructure challenges.

What type of background is best?

A strong data engineering background is ideal, especially with experience in large-scale pipelines and data models.

However, the company is also open to candidates from related backgrounds, such as:

  • Software Engineering

  • Backend Engineering

  • Database Administration

  • BI Engineering

  • Data Science

  • Data Architecture

The most important thing is having strong experience with production-grade data systems.

 

Ready to apply?

Message us using this link!

 