HalalCodeCheck Case Study: Privacy-First Halal E-Code Verification App

Introduction

Muslim consumers checking ingredient labels in grocery stores face a recurring challenge: determining whether food additives, commonly listed as E-codes, comply with halal dietary requirements. This requires reading small label text and understanding the halal status of hundreds of additives across different interpretations of Islamic dietary law. Millions of consumers globally experience this friction regularly.

HalalCodeCheck addresses this with a verification tool supporting three input methods: camera-based scanning, voice search, or manual code entry. The product maintains a verified database of 360+ E-codes and enables verification in 30–90 seconds. Processing happens entirely in the browser using local OCR, so images and OCR text never leave the user’s device.

The recent product direction expands beyond raw verification into guided decision support for first-time users. We shipped a user-first overhaul focused on onboarding clarity, trust signaling, and measurable product learning so we can iterate based on behavior and direct feedback, not assumptions.

The Problem & Vision

User Challenges

The shopping experience for halal-conscious consumers involves multiple friction points. Reading ingredient labels requires squinting at tiny text, understanding E-code numbering systems, cross-referencing uncertain sources, and making purchasing decisions under time pressure in busy retail environments.

User research revealed three critical requirements: speed (verification within a shopping timeframe of 30-90 seconds), accuracy (OCR accuracy target >85% for E-code extraction), and privacy (users expressed discomfort uploading food images to cloud servers due to concerns about data retention and analysis).

Solution Design

HalalCodeCheck implements three complementary input methods to address different user contexts. Scan mode uses camera-based optical character recognition for products in hand. Voice mode enables hands-free operation while navigating store shelves: users speak the E-code number naturally. Manual search provides direct database lookup when the user already knows the code number.

The architectural decision to process all images locally using Tesseract.js in the browser was non-negotiable. This approach eliminates cloud dependencies, ensures no sensitive data transmission, and removes server costs for OCR processing.

Design Principles

The product is designed to be accurate (multi-pass OCR targeting >85% extraction), fast (30–90s verification), private (local-first, no image/OCR transmission), and accessible (scan, voice, and manual input for different shopping scenarios).

Product Development Journey

Initial Concept

The MVP prioritized core functionality: local OCR processing for privacy, a mobile-first interface optimized for in-store use, and fast database lookup from a curated E-code dataset. Development began with single-pass OCR and simple text input, which established baseline performance and user workflows.

User Testing & Feedback

Early testing revealed critical limitations: single-pass OCR missed E-codes on complex labels; users reported voice input was 2–3× faster than typing; and there was demand for batch verification from a single label. Decisions that followed: keep OCR local for privacy, add voice for hands-free use, and implement multi-pass OCR with early exit when high-confidence results are found.

Design & Iteration

The design system matured through four iterations: halal status visualization (green/red/yellow), typography for small screens, accessible touch targets, and contrast for sunlight readability. The product evolved in parallel: V1 single-pass OCR and manual search → V2 multi-pass + preprocessing → V3 voice and batch verification → V4 performance (bundle splitting, image compression, early exit). Each step was driven by user feedback and technical bottlenecks.

Recent Product & UX Updates (Sprint Summary)

We recently shipped a user-first overhaul focused on onboarding, trust, and measurable product learning. The core shift was moving from “data display” to “guided decision support” for first-time visitors.

1) Homepage Reframed for New Users

The homepage messaging now assumes many visitors do not already understand E-codes. Copy and hierarchy were rewritten to answer core first-visit questions quickly: what is being checked, when to use it, and how it helps before a purchase decision.

2) Trust & Credibility Layer Added

A dedicated trust-proof section was added to the homepage to reduce hesitation and uncertainty. Guidance language and expected outcomes were clarified, and sensitive claims were refined to keep communication clear, accurate, and safe.

3) Workflow Improvements Across Core Pages

We tightened post-result flows so users understand what to do next instead of stopping at a status label. Feedback prompts were moved closer to the decision moment, and interaction states (including Yes/No feedback actions) were made more explicit for faster comprehension.

4) Section-Level Analytics Instrumentation

We implemented section-level event tracking to measure engagement and drop-off by page block. Newly added feedback interactions are now instrumented, enabling practical CRO iteration based on observed behavior.
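One way section-level events could be turned into a drop-off figure is sketched below. The event shape (`sessionId`, `section`) and the `dropOffBySection` helper are hypothetical; the actual event schema is not described here.

```typescript
// Hypothetical event: recorded once each time a page section scrolls into view.
interface SectionViewEvent {
  sessionId: string;
  section: string;
}

interface SectionStats {
  section: string;
  sessions: number;    // distinct sessions that reached this section
  dropOffRate: number; // share of the previous section's sessions lost before this one
}

// Count distinct sessions per section, then compare each section
// with the one above it in page order.
function dropOffBySection(
  events: SectionViewEvent[],
  pageOrder: string[],
): SectionStats[] {
  const seen: Record<string, Record<string, boolean>> = {};
  for (const e of events) {
    const bucket = seen[e.section] || (seen[e.section] = {});
    bucket[e.sessionId] = true; // repeated views in one session count once
  }

  const stats: SectionStats[] = [];
  let previous = -1; // -1 marks "no section above"
  for (const section of pageOrder) {
    const sessions = Object.keys(seen[section] || {}).length;
    const dropOffRate =
      previous <= 0 ? 0 : Math.max(0, 1 - sessions / previous);
    stats.push({ section, sessions, dropOffRate });
    previous = sessions;
  }
  return stats;
}
```

A block with a high drop-off rate relative to the one above it is a natural first candidate for copy or layout experiments.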

5) Feedback System Expanded and Operationalized

Feedback capture (rating plus optional comment) is now available across results pages, E-code detail pages, and food category pages. Feedback is persisted in Supabase, making qualitative insight queryable for UX and content prioritization.

6) Abuse-Resistant Feedback Backend

To keep input actionable, feedback persistence now runs through secure RPC logic with server-side safeguards:

Cooldown and rate limiting
Daily submission caps
Duplicate detection via comment hash
Basic spam and link-pattern checks

These controls reduce noise while preserving legitimate user insight.
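The four safeguards above can be sketched as a single check. This is an in-memory illustration only: the real checks run server-side in RPC logic, and the limit values, the rolling 24-hour window, the per-client duplicate scope, and all names below are assumptions.

```typescript
// Assumed limits; the actual values are not stated in this case study.
const COOLDOWN_MS = 30 * 1000;
const DAILY_CAP = 10;
const DAY_MS = 24 * 60 * 60 * 1000;
const LINK_PATTERN = /(https?:\/\/|www\.)/i;

interface FeedbackInput {
  clientId: string;
  rating: number;      // e.g. 1 for "Yes", 0 for "No"
  comment: string;     // may be empty
  submittedAt: number; // epoch milliseconds
}

interface StoredFeedback {
  clientId: string;
  commentHash: string | null;
  submittedAt: number;
}

type Verdict = "accepted" | "cooldown" | "daily_cap" | "duplicate" | "spam";

// FNV-1a (32-bit): a small non-cryptographic hash, used here only to
// compare normalized comments without external dependencies.
function hashComment(comment: string): string {
  const normalized = comment.trim().toLowerCase().replace(/\s+/g, " ");
  let h = 0x811c9dc5;
  for (let i = 0; i < normalized.length; i++) {
    h ^= normalized.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0).toString(16);
}

function checkFeedback(input: FeedbackInput, history: StoredFeedback[]): Verdict {
  const mine = history.filter((f) => f.clientId === input.clientId);

  // Cooldown: reject if the client's latest submission is too recent.
  const last = mine.reduce((max, f) => Math.max(max, f.submittedAt), -Infinity);
  if (input.submittedAt - last < COOLDOWN_MS) return "cooldown";

  // Daily cap over a rolling 24-hour window.
  const dayAgo = input.submittedAt - DAY_MS;
  if (mine.filter((f) => f.submittedAt > dayAgo).length >= DAILY_CAP) return "daily_cap";

  if (input.comment.trim() !== "") {
    // Basic link-pattern check.
    if (LINK_PATTERN.test(input.comment)) return "spam";
    // Duplicate detection via comment hash.
    const hash = hashComment(input.comment);
    if (mine.some((f) => f.commentHash === hash)) return "duplicate";
  }
  return "accepted";
}
```

Running the cheap per-client checks first means most abusive submissions are rejected before any comment text is inspected.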

Product Strategy Direction

The strategy is now explicit:

Build around real user uncertainty, especially first-time visitors
Close the loop: ship, observe, collect feedback, and iterate
Combine quantitative analytics with qualitative comments for roadmap decisions

Technical Deep-Dive: Multi-Pass OCR Strategy

The Challenge

Single-pass OCR produced unreliable results across diverse label environments: missed E-codes on complex backgrounds, character misreads (O→0, I→1), and processing times of 80-240 seconds. The root cause was that a single preprocessing approach couldn’t handle varying lighting, background colors, and text sizes found on real product labels.

Solution: Multi-Pass with Early Exit

Rather than accepting single-pass results, the system attempts multiple preprocessing and recognition combinations, up to a maximum of 6 passes. The key innovation is an intelligent early exit: once the system detects 3+ E-codes with reasonable confidence, it stops processing immediately instead of completing all passes.

How It Works: Multiple preprocessing strategies handle different label conditions; results are scored by E-code count and confidence; early exit triggers ~80% of the time (30–60s). When it doesn’t, the system completes remaining passes for edge cases.
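The control flow described above can be sketched as follows. The recognizer is injected so the loop can be shown without Tesseract.js itself; the `OcrPass` shape, the E-code pattern, and the confidence threshold of 60 are assumptions, since "reasonable confidence" is not quantified here.

```typescript
// Result of one OCR attempt: recognized text plus a 0-100 confidence.
interface PassResult {
  text: string;
  confidence: number;
}

// Hypothetical shape: one pass = one preprocessing strategy + recognition.
type OcrPass<TImage> = (image: TImage) => Promise<PassResult>;

interface OcrOutcome {
  codes: string[];
  confidence: number;
  passesRun: number;
  exitedEarly: boolean;
}

const MAX_PASSES = 6;       // from the description above
const EARLY_EXIT_CODES = 3; // from the description above
const MIN_CONFIDENCE = 60;  // assumption

// Find codes such as "E100", "E 160a", or "e-471"; return them de-duplicated.
function extractECodes(text: string): string[] {
  const pattern = /\bE[\s-]?(\d{3,4}[a-z]?)\b/gi;
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const code = "E" + m[1].toLowerCase();
    if (found.indexOf(code) === -1) found.push(code);
  }
  return found;
}

async function runMultiPassOcr<TImage>(
  image: TImage,
  passes: OcrPass<TImage>[],
): Promise<OcrOutcome> {
  let bestCodes: string[] = [];
  let bestConfidence = 0;
  let passesRun = 0;
  const limit = Math.min(passes.length, MAX_PASSES);

  for (let i = 0; i < limit; i++) {
    const result = await passes[i](image);
    passesRun = i + 1;
    const codes = extractECodes(result.text);

    // Score by E-code count first, then by confidence.
    const better =
      codes.length > bestCodes.length ||
      (codes.length === bestCodes.length && result.confidence > bestConfidence);
    if (better) {
      bestCodes = codes;
      bestConfidence = result.confidence;
    }

    // Early exit: enough codes at sufficient confidence, skip remaining passes.
    if (codes.length >= EARLY_EXIT_CODES && result.confidence >= MIN_CONFIDENCE) {
      return { codes, confidence: result.confidence, passesRun, exitedEarly: true };
    }
  }
  return { codes: bestCodes, confidence: bestConfidence, passesRun, exitedEarly: false };
}
```

Ordering the passes from cheapest or most often successful to most expensive would let the early exit save the most time, though the actual pass order is not specified here.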

Technical Deep-Dive: Voice Recognition Integration

User Need & Challenge

Shoppers need quick input while their hands are occupied: holding products, pushing carts, or dealing with wet or sticky fingers. Converting natural speech into structured E-code data requires handling multiple input formats (“E100”, “100”, “E160a”, “one hundred”), accent variations, and background noise in retail environments.

Solution: Web Speech API Integration

HalalCodeCheck uses the browser-native Web Speech API for voice recognition without external dependencies. The system accepts natural speech patterns: users can say “E100”, “100”, or “one hundred”, and the system normalizes all variations to the standard E-code format.
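A minimal sketch of how a transcript might be normalized is shown below. The function names, the small English number-word parser, and the accepted range of 100 to 1999 are assumptions for illustration, not the app's actual rules.

```typescript
const SMALL: Record<string, number> = {
  zero: 0, one: 1, two: 2, three: 3, four: 4, five: 5, six: 6, seven: 7,
  eight: 8, nine: 9, ten: 10, eleven: 11, twelve: 12, thirteen: 13,
  fourteen: 14, fifteen: 15, sixteen: 16, seventeen: 17, eighteen: 18,
  nineteen: 19,
};
const TENS: Record<string, number> = {
  twenty: 20, thirty: 30, forty: 40, fifty: 50,
  sixty: 60, seventy: 70, eighty: 80, ninety: 90,
};

// Parse phrases like "one hundred and sixty" into 160; null if any word is unknown.
function wordsToNumber(words: string[]): number | null {
  if (words.length === 0) return null;
  let total = 0;
  let current = 0;
  for (const w of words) {
    const small = SMALL[w];
    const tens = TENS[w];
    if (w === "and") continue;
    if (typeof small === "number") current += small;
    else if (typeof tens === "number") current += tens;
    else if (w === "hundred") current = (current || 1) * 100;
    else if (w === "thousand") {
      total += (current || 1) * 1000;
      current = 0;
    } else return null;
  }
  return total + current;
}

// Normalize a speech transcript to the form "E" + digits + optional lowercase letter.
function normalizeSpokenECode(transcript: string): string | null {
  let s = transcript.trim().toLowerCase().replace(/[.,!?]/g, "");
  // Drop a leading "e" when it is a prefix ("e100", "e 100", "e-100"), not part of a word.
  s = s.replace(/^e(?:[\s-]+|(?=\d))/, "");

  // Digit form: "100", "160a", "160 a".
  const digits = /^(\d{3,4})\s*([a-z])?$/.exec(s);
  if (digits) {
    const n = Number(digits[1]);
    if (n < 100 || n > 1999) return null;
    return "E" + n + (digits[2] || "");
  }

  // Word form: "one hundred", "one hundred sixty a".
  const tokens = s.split(/[\s-]+/);
  let suffix = "";
  if (tokens.length > 1 && /^[a-z]$/.test(tokens[tokens.length - 1])) {
    suffix = tokens.pop() as string;
  }
  const n = wordsToNumber(tokens);
  if (n === null || n < 100 || n > 1999) return null;
  return "E" + n + suffix;
}
```

Returning `null` rather than a best guess leaves room for the interface to show what was heard and offer a retry.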

Error Handling:

Microphone permission denied → fallback to manual input with clear explanation
Offline → system detects and prompts user to reconnect
Unrecognized speech → user sees what was detected and can retry or switch methods
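The fallbacks listed above could be wired to the Web Speech API's error codes roughly as follows. The error code strings are the ones defined for `SpeechRecognitionErrorEvent.error`; the `FallbackAction` type, the grouping of codes, and the message wording are hypothetical.

```typescript
// Error codes defined by the Web Speech API for SpeechRecognitionErrorEvent.error.
type SpeechErrorCode =
  | "no-speech"
  | "aborted"
  | "audio-capture"
  | "network"
  | "not-allowed"
  | "service-not-allowed"
  | "bad-grammar"
  | "language-not-supported";

// Hypothetical UI actions matching the three fallbacks above.
type FallbackAction =
  | { kind: "switch-to-manual"; message: string }
  | { kind: "prompt-reconnect"; message: string }
  | { kind: "retry-or-switch"; message: string };

function fallbackFor(error: SpeechErrorCode, online: boolean): FallbackAction {
  // Permission denied or no usable microphone: manual entry is the only option.
  if (error === "not-allowed" || error === "service-not-allowed" || error === "audio-capture") {
    return {
      kind: "switch-to-manual",
      message: "Microphone access is unavailable. You can type the E-code instead.",
    };
  }
  // Browser speech recognition may depend on a network service.
  if (error === "network" || !online) {
    return {
      kind: "prompt-reconnect",
      message: "Voice search needs a connection. Reconnect or type the E-code.",
    };
  }
  return {
    kind: "retry-or-switch",
    message: "We could not catch that. Try again or switch input method.",
  };
}

// When speech was captured but no E-code could be parsed from it,
// show the user what was heard.
function unrecognizedSpeech(heard: string): FallbackAction {
  return {
    kind: "retry-or-switch",
    message: 'Heard "' + heard + '". Try again or switch input method.',
  };
}

// In the browser this might be attached as:
//   recognition.onerror = (event) => show(fallbackFor(event.error, navigator.onLine));
// where `show` is a hypothetical UI helper.
```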

The implementation prioritizes accessibility: hands-free input helps users with mobility limitations, and pattern matching accommodates regional accents.

Technical Deep-Dive: Database Curation & Maintenance

Complexity at Scale

Maintaining halal status for 360+ E-codes involves significant complexity. Food additives have different status classifications depending on the school of Islamic jurisprudence (madhab). Some E-codes have unambiguous status (E621 monosodium glutamate is universally accepted as halal), while others like certain colors (E120 carmine) have divided opinions. The database requires cross-referencing multiple authoritative sources while documenting reasoning for non-obvious classifications.

Database Statistics

Total E-codes: 360+ entries with status classifications
Status Distribution: Halal (~200+), Haraam (~10), Mushbooh (~150+), Unknown (varies)
Categories: Food colors, preservatives, antioxidants, emulsifiers, thickening agents, sweeteners
Source References: Cross-referenced across multiple authoritative sources per entry
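An entry in such a database might be shaped as sketched below. The field names are hypothetical, the `sources` arrays are left empty rather than invented, and labelling E120 as mushbooh is only an illustration of "divided opinions", not the database's actual classification.

```typescript
type HalalStatus = "halal" | "haraam" | "mushbooh" | "unknown";

// Hypothetical record shape for one curated entry.
interface ECodeEntry {
  code: string;      // canonical form, e.g. "E621" or "E160a"
  name: string;
  category: string;
  status: HalalStatus;
  rationale?: string;                     // reasoning for non-obvious classifications
  byMadhab?: Record<string, HalalStatus>; // optional per-school rulings (not populated here)
  sources: string[];                      // references consulted for this entry
}

// Two illustrative entries based on the examples in the text above.
const SAMPLE_DB: Record<string, ECodeEntry> = {
  E621: {
    code: "E621",
    name: "Monosodium glutamate",
    category: "Flavour enhancer",
    status: "halal",
    sources: [],
  },
  E120: {
    code: "E120",
    name: "Carmine",
    category: "Food color",
    status: "mushbooh", // illustrative only: the text says opinions are divided
    rationale: "Opinions differ between schools of jurisprudence.",
    sources: [],
  },
};

// Look up user input such as " e621 " or "E-120"; null if malformed or not in the database.
function lookupECode(db: Record<string, ECodeEntry>, raw: string): ECodeEntry | null {
  const m = /^e?[\s-]*(\d{3,4})([a-z]?)$/i.exec(raw.trim());
  if (!m) return null;
  const key = "E" + m[1] + m[2].toLowerCase();
  return db[key] || null;
}
```

Keeping the rationale and sources on the record itself means a non-obvious status can be explained to the user at the point of lookup.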

Performance Optimizations

Performance improvements came from multiple focused optimizations rather than a single breakthrough.

Image Compression: Smartphone photos (2-5MB) were compressed through canvas-based processing to 200-500KB without perceptible quality loss for OCR. This reduced device processing time; because images are never uploaded, the saving is in on-device work rather than network transfer.

OCR Processing: The multi-pass strategy with early exit logic (described in the OCR section) completed faster on average than single-pass approaches by stopping as soon as high-confidence results were found.

Bundle Size: The initial 677KB bundle was reduced to ~200KB through strategic code splitting: separating vendor libraries and feature modules, and lazy-loading heavy dependencies like Tesseract.js (loaded only when users attempt a scan).
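The lazy-loading idea can be sketched with a small memoizing wrapper. The `lazy` helper is hypothetical; in the app it would presumably wrap a dynamic `import()`, which bundlers such as Vite emit as a separate chunk.

```typescript
// Wrap an async loader so the heavy module is requested only on first use
// and at most once, even if several callers ask for it at the same time.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached !== null) return cached;
    const pending = loader().catch((err) => {
      cached = null; // allow a retry after a failed download
      throw err;
    });
    cached = pending;
    return pending;
  };
}

// Possible usage in the browser (names other than the package are hypothetical):
//   const loadOcr = lazy(() => import("tesseract.js"));
//   scanButton.addEventListener("click", async () => {
//     const ocr = await loadOcr(); // chunk is fetched on the first scan only
//   });
```

Clearing the cache on failure matters on flaky mobile connections: a failed chunk download should not permanently disable scanning for the session.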

Performance metrics before and after optimization.

Metric	Before	After	Improvement
Bundle Size	677KB (190KB gzipped)	~200KB main + chunks (~182KB gzipped)	70% reduction
Image Size	2-5MB raw	200-500KB compressed	85-90% reduction
Lighthouse Score	Baseline	+6-12 points	Significant improvement
Time to Interactive	4-6 seconds	1.5-2 seconds	65% faster

Privacy & Security Architecture

Local-First as Architectural Constraint

Privacy was a design decision from day one: images never leave the device; processing happens in the browser. That constraint drove technical decisions throughout.

Data Handling:

Images are processed in-browser using Tesseract.js; no image transmission to servers
OCR text results are not stored on servers; results exist only in the user’s session
E-code lookups happen against a local database bundled with the application
Optional analytics (usage metrics) only with explicit user consent

What Data Is Stored: With user consent, only non-sensitive metadata is stored, namely the E-codes checked, the verification method used, and a timestamp. This enables product improvements without capturing information about what products users are buying.
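A consent-gated usage event limited to those three fields might look like the sketch below. The type and function names are hypothetical, and the actual analytics schema is not described here.

```typescript
type VerificationMethod = "scan" | "voice" | "manual";

// Only the three fields named above: no image data, no OCR text, no product identity.
interface UsageEvent {
  codes: string[];
  method: VerificationMethod;
  timestamp: string; // ISO 8601
}

// Returns null without consent, so callers have nothing to send.
function buildUsageEvent(
  consentGiven: boolean,
  codes: string[],
  method: VerificationMethod,
  now: Date,
): UsageEvent | null {
  if (!consentGiven) return null;
  return { codes: codes.slice(), method, timestamp: now.toISOString() };
}
```

Because the function accepts neither an image nor raw OCR text, those values cannot end up in an event by accident.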

Security Measures:

HTTPS required for all connections (browsers only grant camera and microphone access in secure contexts)
Explicit permission prompts explaining camera and microphone usage
All sensitive operations performed locally, with no data transmission

Deployment & Infrastructure

HalalCodeCheck is deployed on Cloudflare Pages for global edge network distribution, automatic scaling, and automated deployment on git push. The build process uses Vite for production optimization with static assets cached globally for fast delivery. Build artifacts are immutable and versioned, enabling quick rollback if needed.

Results & Impact

Technical Achievements

Image Processing: 85-90% size reduction through compression while maintaining OCR accuracy
Bundle Optimization: 70% reduction in initial bundle size (677KB → 200KB main bundle)
Page Performance: 65% improvement in Time to Interactive (4-6s → 1.5-2s on mobile networks)
Architecture: Fully functional offline-capable application with no cloud dependencies for core functionality

User Impact

Users can complete halal verification in 30-90 seconds depending on input method. The multi-input approach (scan, voice, manual) reduces friction for different in-store scenarios. The onboarding-first content model reduces first-visit confusion, while trust-proof messaging improves confidence in the decision process. Embedded feedback points and clearer next-step flows make the product more usable and easier to improve continuously.

Product Learning Impact

Section-level analytics now identify where users engage or drop off
Stored feedback comments in Supabase make qualitative patterns queryable
Anti-abuse controls improve feedback quality for decision-making
The team can run faster, evidence-based CRO and content/product experiments

Future Roadmap

Planned enhancements include:

Premium tier features (detailed sourcing, multiple interpretations)
Integration with product barcode scanning for automatic ingredient lookup

Conclusion

HalalCodeCheck shows how a well-defined problem can be solved with a focused technical approach and then evolved into a feedback-driven product system. Five principles emerged:

Local-first architecture builds user trust: Processing on-device with no image transmission addresses privacy without requiring trust in backend infrastructure.

Performance optimization is iterative: Speed came from image compression, bundle splitting, OCR early exit, and lazy loading, not a single breakthrough.

Multiple input methods address real contexts: Voice wasn’t a nice-to-have; testing showed it was significantly faster than typing in shopping scenarios.

Database quality requires ongoing maintenance: The 360+ E-code dataset needs careful curation and source documentation; that’s domain expertise, not just engineering.

Learning systems compound product quality: Section analytics plus stored qualitative feedback create a reliable loop for prioritizing UX and content improvements.

Clear problem definition, execution that doesn’t compromise speed or privacy, and a disciplined ship-observe-iterate cycle made this project stronger over time.
