Building a Production E-Commerce Platform for Japan: Lessons from Kanhaji.jp

What it actually takes to ship a cross-border commerce system under real client constraints — architecture, payments, localization, and the things that only break in production.

Every client who says "we need a simple e-commerce website" is describing a system that is never simple. When that client is based in Japan, serving Japanese customers, operating under Japanese payment rails, and expecting Japanese-grade UX quality — the surface area expands fast.

Kanhaji.jp was one of those projects. What started as a storefront became a full production platform: multi-currency catalog management, GMO and Square payment integrations, inventory synchronization, logistics APIs, and a deployment pipeline that had to stay stable for a real business with real customers.

This is the story of the engineering decisions that made it work.

The Problem That Actually Needed Solving

The client's core need seemed straightforward: sell products online in Japan. But unpacking that single sentence revealed several interlocking problems.

First, Japanese e-commerce is not Western e-commerce with different currency symbols. GMO Payment Gateway dominates the market — not Stripe, not PayPal. GMO has its own SDK, its own error code system, and documentation primarily in Japanese. Square was also in use for the client's physical store operations, which meant inventory changes in one system had to be reflected in the other.

Second, Japanese users have a high standard for site performance and UX fidelity. Slow initial loads, unrendered pages, or broken character encoding are trust signals — negative ones. Serving the right content fast wasn't a nice-to-have; it was table stakes.

Third, the client needed an admin interface. Product management, inventory control, order tracking, logistics integration — all of this had to be built alongside the customer-facing storefront.

This wasn't a template job. It was a full-stack production system with real users, real transactions, and a real business depending on it.

Architecture Decisions — and Why

Next.js over a pure SPA

The storefront had to be indexable. Japan-based search engines, Google's Japanese index, and product-specific landing pages all required server-rendered HTML with proper meta tags, structured data, and Open Graph markup. A React SPA rendered entirely on the client would have made SEO an ongoing battle.

Next.js gave me server-side rendering where it mattered — product pages, category listings, landing pages — and client-side navigation once the user was on the site. The App Router let me colocate server components with client components without architectural gymnastics.
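The structured data mentioned above is typically emitted as JSON-LD from the server-rendered page. The following is a minimal sketch, not the Kanhaji.jp code: the `ProductSummary` shape and its field names are assumptions for illustration, while `Product`, `Offer`, and the availability URLs come from the schema.org vocabulary.

```typescript
// Hypothetical product shape; field names are assumptions for illustration.
interface ProductSummary {
  name: string;
  description: string;
  imageUrl: string;
  sku: string;
  priceYen: number;
  inStock: boolean;
}

// Builds a schema.org Product object suitable for a
// <script type="application/ld+json"> tag rendered on the server.
function buildProductJsonLd(product: ProductSummary, pageUrl: string): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Product",
    name: product.name,
    description: product.description,
    image: product.imageUrl,
    sku: product.sku,
    offers: {
      "@type": "Offer",
      url: pageUrl,
      priceCurrency: "JPY",
      // JPY has no minor unit, so the price is a whole number of yen.
      price: String(product.priceYen),
      availability: product.inStock
        ? "https://schema.org/InStock"
        : "https://schema.org/OutOfStock",
    },
  };
}

// Escape "<" so the serialized JSON cannot terminate the script tag early.
function serializeJsonLd(data: Record<string, unknown>): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```

In a server component, the serialized string would be placed inside the script tag, so crawlers see it in the initial HTML without executing any client-side JavaScript.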

The alternative I seriously considered was a decoupled architecture — a static site generator for the storefront, separate React app for the admin panel. I rejected it because it would have doubled the deployment surface area, made shared component logic awkward, and created two separate authentication contexts to maintain. One Next.js codebase served both concerns cleanly.

MongoDB for flexible product schemas

Product catalogs in real e-commerce are messy. A socket set has SKU variants, dimension attributes, and packaging weights. A food product has expiry dates, storage requirements, and country-of-origin fields. Designing a SQL schema that accommodates both without constant migrations is painful.

MongoDB's document model meant each product could carry exactly the attributes it needed without altering a shared schema. The tradeoff was query complexity — aggregation pipelines replaced SQL joins. For analytics queries across order data, this required more deliberate query design. But for the primary use case — flexible product catalog management — the document model was the right call.
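Both sides of that tradeoff can be sketched in a few lines. The product shapes, collection names, and field names below are hypothetical; only the stage operators (`$match`, `$unwind`, `$group`, `$lookup`, `$sort`) are standard MongoDB aggregation stages.

```typescript
// Shared fields every product carries; all names here are assumptions.
interface BaseProduct {
  _id: string;
  name: string;
  priceYen: number;
}

interface ToolProduct extends BaseProduct {
  kind: "tool";
  variants: { sku: string; sizeMm: number }[];
  packagedWeightGrams: number;
}

interface FoodProduct extends BaseProduct {
  kind: "food";
  expiresOn: string; // ISO date
  storage: "ambient" | "chilled" | "frozen";
  countryOfOrigin: string;
}

type Product = ToolProduct | FoodProduct;

// Narrowing on `kind` gives type-safe access to per-category attributes.
function describeProduct(p: Product): string {
  switch (p.kind) {
    case "tool":
      return `${p.name}: ${p.variants.length} variants`;
    case "food":
      return `${p.name}: ${p.storage}, origin ${p.countryOfOrigin}`;
  }
}

// Revenue per product since a date: the aggregation counterpart of a
// SQL JOIN plus GROUP BY. It would be passed to the driver's aggregate() call.
function revenueByProductPipeline(since: Date): Record<string, unknown>[] {
  return [
    { $match: { status: "confirmed", createdAt: { $gte: since } } },
    { $unwind: "$items" },
    {
      $group: {
        _id: "$items.productId",
        revenueYen: { $sum: { $multiply: ["$items.unitPriceYen", "$items.quantity"] } },
      },
    },
    { $lookup: { from: "products", localField: "_id", foreignField: "_id", as: "product" } },
    { $sort: { revenueYen: -1 } },
  ];
}
```

The discriminated union keeps the flexibility of the document model while still letting the compiler catch a food product being treated as a tool.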

Redis for caching and session state

Two things warranted Redis:

Product catalog caching. Catalog queries involved aggregations across collections that were computationally expensive and rarely changed. Caching results with a TTL of several minutes eliminated repeated round-trips to MongoDB.
Session validation. JWT tokens stored in httpOnly cookies handled authentication, but Redis became a fast lookup for session validation on protected admin routes — sub-millisecond auth checks without hitting the database on every protected request.

The cache invalidation strategy was straightforward: on any product or inventory update via the admin panel, I explicitly invalidated the relevant cache keys rather than waiting for TTL expiry.
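The pattern described above is cache-aside reads with a TTL, plus explicit deletes on write. Here is a minimal sketch under stated assumptions: the `KeyValueCache` interface, the key naming, and the 300-second TTL are illustrative, and an in-memory class stands in for Redis so the example runs on its own.

```typescript
// Minimal async key-value interface; a Redis client would sit behind this.
interface KeyValueCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without a Redis server.
class InMemoryCache implements KeyValueCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (Date.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }

  async del(key: string): Promise<void> {
    this.entries.delete(key);
  }
}

const CATALOG_TTL_SECONDS = 300; // "several minutes"; the exact value is an assumption

// Cache-aside read: return the cached value, or load from the database and cache it.
async function getCached<T>(cache: KeyValueCache, key: string, load: () => Promise<T>): Promise<T> {
  const hit = await cache.get(key);
  if (hit !== null) return JSON.parse(hit) as T;
  const fresh = await load();
  await cache.set(key, JSON.stringify(fresh), CATALOG_TTL_SECONDS);
  return fresh;
}

// Called from admin write paths so readers never wait for TTL expiry.
async function invalidateProduct(cache: KeyValueCache, productId: string): Promise<void> {
  await cache.del(`product:${productId}`);
}
```

The TTL then acts as a safety net for any write path that forgets to invalidate, rather than as the primary freshness mechanism.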

Docker for deployment consistency

The staging and production environments differed in subtle ways early in the project — Node version mismatches, environment variable handling inconsistencies. Docker eliminated this class of problem entirely. One Dockerfile, one image, identical behavior across environments.

In production on AWS EC2, the container ran directly. Not Kubernetes — the traffic volume didn't warrant the operational overhead. A single EC2 instance with Docker, managed by systemd, with Nginx in front, was sufficient and maintainable.

Nginx as the production gateway

Nginx sat in front of the Node.js application handling:

SSL termination via Certbot
gzip compression for static and dynamic responses
Rate limiting on the API surface
WebSocket proxying for the admin dashboard's real-time updates

One decision I'd make differently: I'd invest more time upfront in structuring rate limits by route category rather than applying a blanket limit.
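In Nginx this would mean separate `limit_req_zone` definitions applied per location. To keep the examples in one language, the sketch below shows the same idea at the application layer instead, as a fixed-window counter. The categories, path prefixes, and per-minute budgets are all hypothetical.

```typescript
type RouteCategory = "checkout" | "admin" | "catalog";

// Hypothetical per-category budgets (requests per minute per client).
const LIMITS_PER_MINUTE: Record<RouteCategory, number> = {
  checkout: 10,
  admin: 60,
  catalog: 300,
};

// Path prefixes are assumptions about how the API might be laid out.
function categorize(path: string): RouteCategory {
  if (path.startsWith("/api/checkout") || path.startsWith("/api/payments")) return "checkout";
  if (path.startsWith("/api/admin")) return "admin";
  return "catalog";
}

// Fixed-window counter keyed by client, category, and minute window.
// Old windows are never purged here; a real limiter would expire them.
class CategoryRateLimiter {
  private counts = new Map<string, number>();

  allow(clientId: string, path: string, nowMs: number = Date.now()): boolean {
    const category = categorize(path);
    const windowIndex = Math.floor(nowMs / 60000);
    const key = `${clientId}:${category}:${windowIndex}`;
    const used = this.counts.get(key) ?? 0;
    if (used >= LIMITS_PER_MINUTE[category]) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}
```

Separating the budgets means a burst of catalog browsing cannot exhaust the allowance for checkout requests, which is the failure mode a blanket limit invites.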

The GMO Integration Problem

If there's one part of this project that took disproportionate time relative to its apparent scope, it was GMO. Unlike Stripe — which has excellent documentation, client libraries for every major language, and an active developer community — GMO's developer documentation is sparse in English. The error codes are documented primarily in Japanese.

The integration required several flows:

Initial payment authorization
Capture
Partial refunds
Subscription-style deferred billing for pre-orders

Each flow had its own API endpoint format, its own set of response codes, and its own sandbox behavior that didn't always match production behavior.

My approach was methodical: work through the Japanese documentation directly, map every possible error code to a user-facing message, build an explicit retry strategy for network-level failures, and log every API interaction with enough context to reconstruct any transaction from logs alone. The logging investment paid off multiple times during testing when sandbox transactions behaved unexpectedly.
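The error mapping and retry strategy can be sketched as follows. This is an illustration under assumptions, not GMO's API: the error codes are placeholders (real codes must come from GMO's documentation), and `NetworkError` is a hypothetical class that an HTTP layer would throw when no response arrived.

```typescript
// Placeholder codes only; real GMO codes must come from GMO's own documentation.
const USER_MESSAGES = new Map<string, string>([
  ["EXAMPLE_CARD_DECLINED", "Your card was declined. Please try another card."],
  ["EXAMPLE_CARD_EXPIRED", "Your card has expired."],
]);
const FALLBACK_MESSAGE = "Payment could not be completed. Please try again later.";

// Unknown codes fall back to a generic message instead of leaking raw codes.
function toUserMessage(gatewayCode: string): string {
  return USER_MESSAGES.get(gatewayCode) ?? FALLBACK_MESSAGE;
}

// Hypothetical error thrown when the request never received a response.
class NetworkError extends Error {
  readonly isNetworkError = true;
}

function isNetworkError(err: unknown): boolean {
  return typeof err === "object" && err !== null
    && (err as { isNetworkError?: unknown }).isNetworkError === true;
}

// Exponential backoff: 500 ms, 1 s, 2 s, ... capped at 8 s.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries only network-level failures; gateway rejections propagate at once.
async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  delayFor: (attempt: number) => number = backoffDelayMs,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const lastAttempt = attempt + 1 >= maxAttempts;
      if (!isNetworkError(err) || lastAttempt) throw err;
      await sleep(delayFor(attempt));
    }
  }
}
```

Keeping the retry decision separate from the message mapping matters: a declined card should never be retried automatically, while a dropped connection usually should.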

Keeping inventory consistent between the website and Square required a reconciliation layer. When an order was placed via the website (with payment routed through GMO), inventory had to be decremented in both systems. Square's Catalog API was the source of truth, with a sync job that ran on order confirmation.

This introduced eventual consistency — not ideal — but acceptable for the client's volume. A proper message queue would have been the cleaner solution.
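The core of such a reconciliation layer is a diff between two stock snapshots. The sketch below assumes both snapshots have already been fetched as SKU-to-quantity maps; the types and function name are hypothetical, and no Square API calls are shown.

```typescript
// SKU -> on-hand quantity. Both snapshots are assumed to be fetched elsewhere.
type StockLevels = Map<string, number>;

interface Adjustment {
  sku: string;
  from: number;
  to: number;
}

// Treats `sourceOfTruth` (Square, per the text) as authoritative and returns
// the changes needed to bring the website's copy in line with it.
function reconcile(sourceOfTruth: StockLevels, website: StockLevels): Adjustment[] {
  const adjustments: Adjustment[] = [];
  sourceOfTruth.forEach((truth, sku) => {
    const local = website.get(sku) ?? 0;
    if (local !== truth) adjustments.push({ sku, from: local, to: truth });
  });
  // SKUs the website still lists but the source of truth no longer has.
  website.forEach((local, sku) => {
    if (!sourceOfTruth.has(sku) && local !== 0) adjustments.push({ sku, from: local, to: 0 });
  });
  return adjustments;
}
```

Returning the adjustments instead of applying them directly makes each sync run easy to log and audit, which helps when the two systems drift.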

Japanese Localization Beyond i18n

Translation is the easy part of localization. The harder parts don't show up in i18n libraries.

Address format is the inverse of Western format: postal code → prefecture → city → ward → street number → building. A form that collects addresses in the wrong order signals to Japanese users that the product wasn't built for them.
Name ordering places family name before given name. User registration, order confirmation emails, and shipping labels all had to respect this. Small error, serious trust signal if wrong.
Character encoding required explicit attention throughout. MongoDB stores UTF-8 natively, but ensuring Japanese characters survived the full round-trip — from client input through Express middleware through MongoDB and back — required testing at each layer rather than assuming it worked.
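The address and name rules above translate into small formatting helpers. This is a sketch with assumed field names. It also normalizes full-width digits with NFKC, on the assumption that some users will type postal codes with a Japanese input method.

```typescript
interface JapaneseAddress {
  postalCode: string; // 7 digits, with or without a hyphen
  prefecture: string;
  city: string;
  ward?: string;
  street: string;
  building?: string;
}

// Normalizes "1500001", "150-0001", or full-width digits to "〒150-0001".
function formatPostalCode(raw: string): string {
  const digits = raw.normalize("NFKC").replace(/\D/g, "");
  if (digits.length !== 7) throw new Error(`Invalid postal code: ${raw}`);
  return `〒${digits.slice(0, 3)}-${digits.slice(3)}`;
}

// Largest unit first: postal code, prefecture, city, ward, street, building.
function formatAddress(a: JapaneseAddress): string {
  const line = [a.prefecture, a.city, a.ward ?? "", a.street].join("");
  return [formatPostalCode(a.postalCode), line, a.building ?? ""].filter(Boolean).join(" ");
}

// Family name first, for labels, emails, and order confirmations.
function formatName(familyName: string, givenName: string): string {
  return `${familyName} ${givenName}`;
}
```

Storing family and given names as separate fields, rather than a single full-name string, is what makes it possible to render the correct order everywhere.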

Production Debugging at a Distance

Debugging a production system you can't physically access teaches you to invest in observability before you need it. My logging setup used Winston with structured JSON output, capturing request IDs that could trace a request from the Nginx access log through the Next.js API layer to the Express backend to MongoDB.
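The idea behind request-scoped structured logging fits in a few lines. This sketch is dependency-free and stands in for the Winston setup, which is not reproduced here. The `x-request-id` header name is an assumption about what the proxy forwards.

```typescript
type LogLevel = "info" | "warn" | "error";
type LogSink = (line: string) => void;

interface RequestLogger {
  log(level: LogLevel, message: string, context?: Record<string, unknown>): void;
}

// Every line carries the same requestId, so one search reconstructs the request.
function createRequestLogger(requestId: string, sink: LogSink = console.log): RequestLogger {
  return {
    log(level, message, context = {}) {
      sink(JSON.stringify({
        ...context,
        // Fixed fields come last so caller context cannot overwrite them.
        timestamp: new Date().toISOString(),
        level,
        requestId,
        message,
      }));
    },
  };
}

// Reuse an upstream ID (assumed to arrive as x-request-id) or mint a new one.
// Node lower-cases incoming header names, hence the lower-case key.
function resolveRequestId(headers: Record<string, string | undefined>): string {
  const upstream = headers["x-request-id"];
  if (upstream) return upstream;
  return `req-${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;
}
```

A handler would create one logger per request and pass it down to the payment and database layers, so every line for that request shares an ID.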

One incident stands out. Orders placed during a specific time window were failing silently: no error surfaced to the user, but no confirmation arrived either. The structured logs revealed that the GMO capture call was timing out at exactly 30 seconds, the limit I'd configured in the HTTP client, which GMO's processing time was reaching under load.

Increasing the timeout and adding an explicit retry on timeout resolved it.
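A timeout with a retry on expiry can be sketched like this. The helper names are hypothetical and the actual HTTP client configuration is not shown. Note the assumption built into any such retry: the first attempt may have succeeded on the gateway side, so the retried call needs to be idempotent or preceded by a status check.

```typescript
class TimeoutError extends Error {
  readonly isTimeout = true;
}

function isTimeout(err: unknown): boolean {
  return typeof err === "object" && err !== null
    && (err as { isTimeout?: unknown }).isTimeout === true;
}

// Rejects with TimeoutError if `work` has not settled within `timeoutMs`.
// This stops waiting; it does not cancel the underlying request.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new TimeoutError(`Timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Retries only on timeout, up to maxAttempts; other failures propagate immediately.
async function callWithTimeoutRetry<T>(
  start: () => Promise<T>,
  timeoutMs: number,
  maxAttempts = 2,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await withTimeout(start(), timeoutMs);
    } catch (err) {
      if (!isTimeout(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}
```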

Without the logs, I'd have been guessing.

What I'd Build Differently

Order processing via a message queue. Order confirmation, inventory update, email dispatch, and logistics submission currently happen synchronously in a single Express handler. Bull with Redis would decouple these jobs, allow retries on failure, and give visibility into job status.
CDN for static assets. Product images are currently served directly from EC2. CloudFront would reduce latency for Japanese users and offload the origin server.
Granular cache invalidation. The current approach invalidates entire cache namespaces on updates. Tag-based invalidation would be more precise as the catalog grows.
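Tag-based invalidation, the last item above, comes down to an index from tags to the cache keys that depend on them. The sketch below keeps that index in memory. The class and the tag naming are hypothetical; with Redis, the index could instead be kept in sets.

```typescript
// Tracks which cache keys depend on which tags (e.g. "product:42", "category:tea").
class TagIndex {
  private keysByTag = new Map<string, Set<string>>();

  // Record that `key` depends on each of `tags`.
  register(key: string, tags: string[]): void {
    for (const tag of tags) {
      const keys = this.keysByTag.get(tag) ?? new Set<string>();
      keys.add(key);
      this.keysByTag.set(tag, keys);
    }
  }

  // Returns the cache keys to delete for a tag and forgets the tag.
  // A key may remain listed under other tags; deleting it twice is harmless.
  invalidate(tag: string): string[] {
    const keys = this.keysByTag.get(tag);
    if (!keys) return [];
    this.keysByTag.delete(tag);
    return Array.from(keys);
  }
}
```

Updating a single product would then clear only the entries tagged with it, such as its detail page and the listing pages that include it, instead of the whole catalog namespace.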

Key Takeaways

SSR is not optional when SEO is a business requirement. Next.js handles both server and client cleanly without architectural compromise.
Redis caching is most effective when paired with explicit invalidation logic, not just TTLs.
Payment integrations from non-English-dominant markets require more investment than their Stripe equivalents. Plan for it.
Localization is deeper than string translation — it's address formats, name ordering, date formats, and cultural UX expectations.
Observability investment at build time pays dividends in production debugging. Log structured data from day one.
Docker eliminates environment drift. It's not complexity — it's discipline.

Conclusion

Kanhaji.jp is a production system serving real transactions for a real business. The architecture decisions — Next.js for SEO, MongoDB for catalog flexibility, Redis for performance, Docker for consistency, Nginx for the production gateway — each solved a real problem with an appropriate tool.

The most interesting engineering wasn't in the choices that worked smoothly; it was in the choices that revealed their tradeoffs under production load. Every system teaches you something.

This one taught me to log everything, plan for payment API oddities, and never underestimate the engineering depth of "just a simple e-commerce site."