Data Engineer in Japan: REMOTE + NO JAPANESE REQUIRED
Let’s be clear.
This blog is about Data Engineers in Japan.
What they do, how they do it, and how you can do it too.
Got it?
Good.
There’s a lot to cover, so let’s get started.
What is this role?
A fast-growing travel tech company is looking for a Senior Data Engineer to help build and improve the data platform behind its global hospitality products.
The company develops technology used by hotels to streamline operations, improve efficiency, and drive sales growth. Behind those products is a modern data platform that supports internal business intelligence, recommendation engines, AI chatbot products, and future AI agent use cases.
In this role, you will join a small but highly experienced ML/Data/AI team and work closely with two veteran Data Engineers, ML Engineers, product managers, and internal stakeholders.
This role is especially focused on:
Building and improving large-scale data pipelines
Modeling complex hospitality industry datasets
Improving the company’s data catalog and metadata
Helping humans and AI agents discover and query data more effectively
Supporting AI-driven products such as recommendation engines and chatbots
Improving data platform performance and cost efficiency
A major focus of the role is making the company’s data easier to discover, understand, and query — not only for humans, but also for AI agents that need to find the right data and construct accurate, high-performance queries.
What you will do
This role sits at the intersection of data engineering, data architecture, AI infrastructure, and hospitality technology.
You will help the company ingest and model new external data sources that provide a deeper view of the hospitality industry. These datasets can support everything from business intelligence to AI-powered product features, recommendation engines, and chatbot experiences.
Key areas of work include:
Ingesting new external data sources related to the hospitality industry
Designing data models for critical business and product entities
Improving the usability of the company’s data platform
Helping internal users find and query the right data
Supporting ML and AI teams with reliable data features and pipelines
Reducing unnecessary compute costs through better modeling and query design
This is not just a pipeline-building role. You will also help shape how the company organizes, explains, and exposes data across the business.
Key Responsibilities
As a Senior Data Engineer, you will design, develop, and maintain robust batch and streaming data pipelines using tools such as Apache Beam, Spark, and BigQuery.
Your main responsibilities will include:
Designing, developing, and maintaining batch and streaming data pipelines
Processing large-scale hospitality industry datasets
Improving data models for the company’s most important entities
Making sure data consumers can query accurate data efficiently
Optimizing data models and queries to control compute costs
Working with data producers to improve metadata and data discoverability
Supporting data catalog and platform initiatives
Helping humans and AI agents find, understand, and use the right data
Collaborating with ML Engineers on data features and pipelines
Supporting AI products, recommendation engines, and chatbot-related features
You will also play an important role in making the data platform more reliable, scalable, and useful for teams across the organization.
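Since cost control through better modeling comes up repeatedly in this role, here is an illustrative sketch of the core idea behind it: date-partitioned tables (the layout BigQuery uses) let a well-written query scan only the partitions it needs. This is plain Python simulating the effect, not BigQuery itself, and the table layout and field names are invented for illustration.

```python
from datetime import date, timedelta

# Hypothetical example: hotel bookings stored as one partition per day,
# mirroring a date-partitioned warehouse table.
partitions = {
    date(2024, 1, 1) + timedelta(days=i): [
        {"booking_id": f"{i}-{j}", "amount": 100 + j} for j in range(1000)
    ]
    for i in range(30)
}

def full_scan(start, end):
    """Naive query: reads every partition, then filters (scans all rows)."""
    scanned = sum(len(rows) for rows in partitions.values())
    result = [r for d, rows in partitions.items() if start <= d <= end for r in rows]
    return result, scanned

def pruned_scan(start, end):
    """Partition-pruned query: reads only the partitions in the date range."""
    hit = {d: rows for d, rows in partitions.items() if start <= d <= end}
    scanned = sum(len(rows) for rows in hit.values())
    result = [r for rows in hit.values() for r in rows]
    return result, scanned

start, end = date(2024, 1, 1), date(2024, 1, 7)
rows_a, scanned_a = full_scan(start, end)
rows_b, scanned_b = pruned_scan(start, end)
assert rows_a == rows_b  # same answer either way...
print(scanned_a, scanned_b)  # ...but 30000 vs 7000 rows scanned
```

In a warehouse that bills by bytes scanned, the second query pattern is the difference between paying for a month of data and paying for a week of it, which is exactly the kind of trade-off this role is expected to reason about.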
Tech Stack
The team works with a modern data engineering stack. You do not necessarily need hands-on experience with every tool, but you should have strong experience building scalable data systems and be comfortable learning new technologies.
The main technical stack includes:
Data warehouse: BigQuery
Processing: Apache Beam / Dataflow, Spark, DuckDB
Orchestration and transformation: dbt, Airflow, Prefect
Languages: Python, Scala, or Java
Cloud and storage: GCS and AWS
Relevant AWS services: AWS Glue and related data services
Experience with BigQuery, Apache Beam, Spark, Dataflow, AWS Glue, dbt, Airflow, or Prefect would be especially relevant.
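To make the pipeline side of the stack concrete, here is a minimal sketch of the shape of a batch pipeline: read, parse, filter, aggregate. It is written in plain Python for readability; on this stack the same stages would typically be Apache Beam transforms running on Dataflow, and the field names (`hotel_id`, `rate`) are hypothetical.

```python
# Illustrative sketch only: the read/parse/filter/aggregate shape of a
# batch pipeline, expressed as plain Python generator stages.

def read(records):
    """Source stage: a real pipeline would read from GCS or BigQuery."""
    yield from records

def parse(rows):
    """Turn raw CSV lines into typed records."""
    for row in rows:
        hotel_id, rate = row.split(",")
        yield {"hotel_id": hotel_id, "rate": float(rate)}

def filter_valid(rows):
    """Drop malformed or non-positive rates before aggregation."""
    yield from (r for r in rows if r["rate"] > 0)

def average_rate_by_hotel(rows):
    """Sink stage: aggregate per-hotel average rates."""
    totals = {}
    for r in rows:
        s, n = totals.get(r["hotel_id"], (0.0, 0))
        totals[r["hotel_id"]] = (s + r["rate"], n + 1)
    return {h: s / n for h, (s, n) in totals.items()}

raw = ["h1,120.0", "h1,80.0", "h2,200.0", "h2,-1.0"]
result = average_rate_by_hotel(filter_valid(parse(read(raw))))
print(result)  # {'h1': 100.0, 'h2': 200.0}
```

If you can explain a pipeline at this level of decomposition, and then discuss how the same stages behave at billions of rows, you are speaking the team's language regardless of which orchestrator or runner you used before.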
Required Experience
The ideal candidate will have at least 5 years of combined experience in data engineering, data architecture, or software engineering.
This role is especially suited for someone who has hands-on experience developing data pipelines and data models at a large scale. Experience in technology-driven environments such as major internet companies, e-commerce platforms, large retailers, or high-volume data organizations would be highly relevant.
Relevant backgrounds may include:
Data Engineering
Data Architecture
Backend Engineering
Software Engineering
Database Administration
BI Engineering
Data Science with production data system experience
A background in software engineering is a strong plus because the team values engineering best practices, maintainability, scalability, and clean system design.
The company is especially interested in candidates who can show:
Experience building production-grade data pipelines
Strong understanding of data modeling
Ability to work with large-scale datasets
Experience improving data quality, performance, and reliability
Ability to lead technical projects
Clear communication with technical and non-technical stakeholders
Interest in modern data platforms and AI-related infrastructure
Language Requirements
This role requires fluency in English.
The team includes engineers and product managers from diverse backgrounds, so you should be comfortable using English in technical discussions and cross-functional collaboration.
You should be able to:
Explain technical ideas clearly in English
Discuss architecture and data platform decisions
Collaborate with engineers, ML teams, and product managers
Communicate trade-offs around performance, cost, and reliability
Work effectively in an international team environment
Japanese is not listed as a requirement for this role.
Nice-to-Have Experience
Experience with AI-related data infrastructure would be a strong advantage.
This includes experience or interest in:
RAG
Context-aware AI systems
Agent orchestration
Embeddings
Knowledge graphs
Vector databases
Data catalogs
Semantic layers
Metadata management
AI-ready data infrastructure
You do not need to be an ML Engineer, but you should be interested in how data platforms can support the next generation of AI-powered products.
Because part of the role involves making data more discoverable and usable by AI agents, experience thinking about metadata, structured data access, and query accuracy would be highly relevant.
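As a toy illustration of why that metadata work matters, here is a sketch of a keyword-ranked catalog lookup: with rich descriptions, columns, and tags attached to each table, a human or an AI agent can find the right dataset before writing a query. Everything here, including the table and column names, is invented for illustration; production catalogs and semantic layers are far more sophisticated.

```python
# Hypothetical sketch of the idea behind a data catalog: metadata makes
# tables discoverable. Table and column names are invented.

CATALOG = [
    {
        "table": "analytics.bookings_daily",
        "description": "Daily hotel booking counts and revenue per property",
        "columns": ["booking_date", "hotel_id", "bookings", "revenue_jpy"],
        "tags": ["bookings", "revenue", "daily"],
    },
    {
        "table": "analytics.room_inventory",
        "description": "Available room inventory per hotel and date",
        "columns": ["snapshot_date", "hotel_id", "room_type", "rooms_available"],
        "tags": ["inventory", "rooms"],
    },
]

def discover(keywords):
    """Rank catalog entries by how many keywords match their metadata."""
    def score(entry):
        text = " ".join(
            [entry["table"], entry["description"], *entry["columns"], *entry["tags"]]
        ).lower()
        return sum(1 for kw in keywords if kw.lower() in text)
    ranked = sorted(CATALOG, key=score, reverse=True)
    return [e["table"] for e in ranked if score(e) > 0]

print(discover(["revenue", "daily"]))  # ['analytics.bookings_daily']
```

The same principle scales up: the better the metadata, the more accurately an agent can pick a table and construct a correct, efficient query against it.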
Who This Role Is Good For
This role is a strong fit for someone who enjoys building data systems that are both technically solid and directly useful to the business.
You may be a good match if you:
Have built large-scale batch or streaming data pipelines
Care about data quality, modeling, reliability, and performance
Enjoy reducing complexity and making data easier for others to use
Want to work on AI-adjacent data infrastructure
Like working in a small, senior team where your decisions have visible impact
Prefer a fully remote work style
Are interested in travel tech, hospitality data, and global products
Want to work with experienced engineers on complex data platform problems
This is not just a maintenance role. You will be helping shape how the company’s data platform evolves as AI agents, recommendation systems, and chatbot products become more important to the business.
How to Stand Out as an Applicant
To stand out, you should show that you can think beyond pipeline implementation.
The strongest candidates will be able to explain how they have designed data systems that are reliable, scalable, cost-conscious, and easy for others to use.
In your resume and interviews, it would be helpful to highlight:
Large-scale data pipelines you have built or maintained
Data models you designed or improved
Query performance improvements
Cost optimization projects
Data catalog or metadata initiatives
Work with ML Engineers or AI-related teams
Experience with BigQuery, Spark, Beam, Dataflow, AWS Glue, dbt, Airflow, or Prefect
Examples of leading technical projects
Times you worked with business or product stakeholders
The scale of the systems you worked on, such as data volume, number of pipelines, or business impact
AI-related experience is also worth emphasizing, especially if you have worked with:
RAG
Embeddings
Vector databases
Knowledge graphs
AI agents
Recommendation systems
Chatbot data infrastructure
The key is to show not only what you built, but why it mattered and how it improved the platform, product, or business.
Career Path
This role can be a strong next step for an experienced Data Engineer who wants to move into more strategic data platform ownership.
From this role, possible career paths include:
Lead Data Engineer
Data Architect
Data Platform Lead
Analytics Engineering Lead
AI/Data Infrastructure Lead
Engineering Manager for Data or Platform teams
Technical Lead for AI infrastructure initiatives
Because the team is small and senior, you will likely have opportunities to influence architecture, improve platform standards, and shape how the organization uses data.
This could also be a good transition role for someone coming from:
Backend engineering into data platform engineering
DBA work into modern cloud data infrastructure
BI engineering into large-scale data architecture
Data science into production-grade data engineering
Data engineering into AI infrastructure
For someone who wants to move from implementation-focused engineering into broader technical leadership, this role offers a clear path.
Compensation and Work Style
The salary is up to ¥12M, depending on experience.
The role is fully remote, making it suitable for candidates who want flexibility while still working on complex, high-impact technical challenges.
Key details include:
Salary: Up to ¥12M
Work style: Fully remote
Team: 5-person ML/Data/AI unit
Environment: International and engineering-driven
Focus: Data platform, AI infrastructure, hospitality datasets, and product impact
You will be part of a diverse engineering and product environment, working with people who are building unique products for the hospitality and travel industry.
Interview Process
The interview process is structured and practical.
The expected process includes:
Initial Screen
A 30-minute conversation with the Hiring Manager. This is usually an opportunity to discuss your background, motivation, technical experience, and fit for the role.
Practical Assessment
A take-home assignment designed to help the team understand how you approach real data engineering problems.
Technical Interview
A deep-dive interview with the two current Data Engineers. This stage will likely explore your technical experience, system design thinking, data modeling ability, and approach to scalable data pipelines.
Final Interview
A final leadership review to assess overall fit, communication style, and alignment with the team’s goals.
To prepare, you should be ready to discuss:
Past data pipeline projects
Data modeling decisions
Big data architecture
Query optimization
Cost-performance trade-offs
Technical leadership examples
Collaboration with ML, product, or business teams
FAQ
Is this a pure data pipeline role?
No. Pipeline development is important, but the role also involves:
Data modeling
Platform improvements
Metadata and data catalog work
Cost optimization
AI-ready data infrastructure
Collaboration with ML and product teams
Do I need AI or machine learning experience?
You do not need to be an ML Engineer, but AI-related experience is a plus.
The role supports:
Recommendation engines
Chatbot products
AI agents
Data discovery for AI systems
Future AI-driven product features
Is Japanese required?
The listed language requirement is fluency in English.
Japanese is not listed as a requirement.
Is this role fully remote?
Yes. The role is fully remote.
This makes it a strong option for candidates who want flexibility while still working on complex data platform and AI infrastructure challenges.
What type of background is best?
A strong data engineering background is ideal, especially with experience in large-scale pipelines and data models.
However, the company is also open to candidates from related backgrounds, such as:
Software Engineering
Backend Engineering
Database Administration
BI Engineering
Data Science
Data Architecture
The most important thing is having strong experience with production-grade data systems.