AI Data Architect (Remote)
Great Day Improvements, LLC
- Brand Name: Great Day Improvements (Corp/Shared Services)
- Category: Information Technology
- Min: USD $155,000.00/Yr.
- Max: USD $165,000.00/Yr.
Overview
Great Day Improvements - AI Data Architect (Remote, USA)
Company Overview
Since its founding 13 years ago, Great Day Improvements, LLC has grown rapidly toward its vision of becoming one of the largest home improvement companies in the U.S. Headquartered in Twinsburg, Ohio, Great Day Improvements is a $1.5 billion, vertically integrated, direct-to-consumer provider of premium home improvement products.
The company’s family of brands includes Patio Enclosures®, Champion Windows and Home Exteriors®, Universal Windows Direct®, Apex Energy Solutions®, Stanek Windows®, Leafguard®, Englert®, and The Bath Authority.
With an expanding workforce of over 4,800 employees across 130 metropolitan markets throughout the U.S., Great Day Improvements continues to rank among the top home improvement companies nationwide and is one of the fastest growing private companies in America.
Summary
This role is foundational to scaling trusted AI across the enterprise. The AI Data Architect owns how enterprise data is structured, enriched, and retrieved to power AI-driven decisioning across Great Day Improvements. This role focuses on building scalable Retrieval-Augmented Generation (RAG) systems that ensure AI outputs are accurate, consistent, and grounded in authoritative data. The AI Data Architect defines chunking strategies, metadata schemas, hybrid retrieval logic, and authority ranking models that form the foundation of trustworthy AI at Great Day Improvements.
The ideal candidate will have deep experience in data architecture, semantic search, and knowledge systems, with hands-on expertise in vector databases, embedding models, and retrieval pipeline design. They will lead the translation of business domain knowledge, such as HR policies, call center procedures, and operational workflows, into structured data models, taxonomies, and retrieval logic that AI systems can reliably interpret. This role establishes and enforces the data architecture standards that engineering, product, and operations teams rely on, ensuring every AI interaction is well-governed, explainable, and continuously improving.
This role requires a systems thinker who is self-motivated, comfortable navigating ambiguity, and energized by the challenge of making enterprise data truly AI-ready. The ideal candidate will embrace emerging technologies including knowledge graphs, agentic AI architectures, and LLM Ops observability tooling, and will be enthusiastic about building the data foundations that enable Great Day Improvements to scale AI across its growing portfolio of brands. This role defines and governs the architecture and standards for AI data systems, while partnering with engineering and platform teams responsible for implementation and execution.
Salary Range: $155,000 - $165,000
This salary range represents a good-faith estimate of the compensation for this position. Actual compensation may vary based on education, experience, knowledge, skills, abilities, internal equity, and alignment with market data.
Responsibilities
RAG System Design (Core Responsibility)
- Design end-to-end Retrieval-Augmented Generation (RAG) architecture, including ingestion, chunking, embedding, indexing, retrieval, and response generation
- Define chunking strategies based on content type, semantic coherence, and use case requirements
- Build metadata schemas, tagging frameworks, and document structures to optimize retrieval precision
- Develop hybrid retrieval strategies combining vector similarity, keyword search, metadata filters, and graph-based reasoning
- Implement reranking logic and relevance scoring to optimize answer accuracy and grounding
- Establish retrieval pipelines that consistently return high-quality, contextually relevant results across enterprise use cases
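The hybrid retrieval logic described above can be sketched minimally. This is an illustrative, self-contained example, not a production design: the term-frequency "embedding," the scoring weights, and the sample documents are all assumptions standing in for a real embedding model and vector store.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase tokenizer; a real pipeline would use a proper analyzer
    return [t.strip(".,").lower() for t in text.split()]

def embed(tokens):
    # Stand-in "embedding": a term-frequency vector (a real system
    # would call an embedding model and store vectors in a vector DB)
    return Counter(tokens)

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, docs, metadata_filter=None, w_vec=0.7, w_kw=0.3):
    """Score each doc as a weighted blend of vector similarity and
    keyword overlap, after an optional metadata pre-filter."""
    q_tokens = tokenize(query)
    q_vec = embed(q_tokens)
    results = []
    for doc in docs:
        if metadata_filter and not metadata_filter(doc["meta"]):
            continue  # metadata filtering happens before scoring
        d_tokens = tokenize(doc["text"])
        vec_score = cosine(q_vec, embed(d_tokens))
        kw_score = len(set(q_tokens) & set(d_tokens)) / max(len(set(q_tokens)), 1)
        results.append((w_vec * vec_score + w_kw * kw_score, doc))
    # Sorting by blended score is a simple form of reranking
    return sorted(results, key=lambda r: r[0], reverse=True)
```

In practice the blend weights, the reranker, and the filter predicates would be tuned per use case (e.g., HR policy lookup vs. call-center FAQ).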
Data Structuring & Normalization
- Own upstream data preparation standards that enable effective retrieval, clearly separating data structuring responsibilities from downstream retrieval and RAG execution
- Define standards for document ingestion, cleaning, parsing, and normalization across structured and unstructured enterprise data prior to retrieval
- Transform raw enterprise data (PDFs, knowledge bases, policies, call transcripts, wiki pages) into AI-ready formats
- Create canonical document structures and semantic representations prior to vectorization
- Standardize taxonomy, terminology, and metadata across business domains to ensure consistency at scale
- Design and maintain ontologies and knowledge graphs that enrich retrieval context and reduce hallucinations
- Define and govern the onboarding, validation, and lifecycle management of enterprise data sources, including approval, updates, and deprecation of content used in AI systems
- Define and enforce data quality standards for enterprise content, including completeness, consistency, accuracy, and maintainability of data used in AI systems
Business Logic to AI Translation
- Lead the extraction, structuring, and codification of business domain knowledge for AI consumption
- Translate business rules into metadata models, labeling strategies, and retrieval logic
- Define how different content types (policies, FAQs, procedures, product documentation) are interpreted, prioritized, and surfaced by AI
- Define and enforce alignment of AI behavior with real-world business intent, decision logic, and operational workflows
Authority, Governance & Trust
- Define source-of-truth hierarchies and authority ranking models across content repositories
- Implement version control, document freshness tracking, and conflict resolution strategies for overlapping content
- Define access control logic (SSO, role-based access) within retrieval workflows to ensure data security and compliance
- Define standards to ensure AI responses are traceable, explainable, and grounded in authoritative, auditable sources
- Define data lineage and provenance tracking standards to support governance and regulatory requirements
- Drive adoption of AI data architecture standards across engineering, product, and business teams, ensuring compliance with defined data, retrieval, and governance models
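A source-of-truth hierarchy with freshness tracking, as described above, might be sketched like this. The source names, rank values, weights, and half-life are illustrative assumptions only:

```python
from datetime import datetime, timezone

# Illustrative source-of-truth hierarchy: lower rank = more authoritative
SOURCE_AUTHORITY = {"hr_policy_portal": 1, "dept_wiki": 2, "call_transcript": 3}

def authority_score(source, last_updated, half_life_days=180):
    """Blend source authority with document freshness.
    Freshness decays exponentially; weights are assumptions for illustration."""
    age_days = (datetime.now(timezone.utc) - last_updated).days
    freshness = 0.5 ** (age_days / half_life_days)
    # Unknown sources rank below every known source
    rank = SOURCE_AUTHORITY.get(source, max(SOURCE_AUTHORITY.values()) + 1)
    return 0.7 * (1.0 / rank) + 0.3 * freshness

def resolve_conflict(candidates):
    # When two documents answer the same question, prefer the one with
    # the higher combined authority/freshness score
    return max(candidates, key=lambda c: authority_score(c["source"], c["updated"]))
```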
LLM & Retrieval Orchestration
- Define and govern where and how LLMs are used across the pipeline (classification, routing, summarization, answering)
- Balance cost, latency, and performance across model usage, including token optimization strategies
- Define and enforce query routing strategies based on user intent (policy lookup, FAQ, transactional, analytical)
- Own and optimize orchestration between retrieval systems, LLMs, and agentic AI workflows
- Own evaluation and integration of emerging orchestration frameworks and Model Context Protocol (MCP) standards
Evaluation & Continuous Improvement
- Own evaluation frameworks for retrieval accuracy, answer quality, and grounding using tools such as RAGAS, LangSmith, or Langfuse
- Own the creation and maintenance of domain-specific test sets across business areas (HR, call center, operations, product knowledge)
- Own analysis of failure cases and continuous improvement of retrieval strategies, chunking approaches, and data structuring
- Define measurable performance standards for precision, recall, grounding, consistency, and latency
- Own observability and monitoring pipelines to track retrieval and LLM performance in production
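The precision and recall standards above can be computed against a gold test set of labeled relevant chunks. This plain-Python sketch is framework-agnostic (not tied to RAGAS or LangSmith); the test-set shape is an assumption:

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved chunk IDs that are truly relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return len(set(top_k) & set(relevant)) / len(top_k)

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant chunk IDs found in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(set(relevant))

def evaluate(test_set, k=5):
    # test_set: list of {"retrieved": [chunk IDs], "relevant": [chunk IDs]}
    p = sum(precision_at_k(t["retrieved"], t["relevant"], k) for t in test_set)
    r = sum(recall_at_k(t["retrieved"], t["relevant"], k) for t in test_set)
    n = len(test_set)
    return {"precision@k": p / n, "recall@k": r / n}
```

Grounding and consistency require LLM-as-judge or human evaluation on top of these retrieval metrics.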
Qualifications
Required Qualifications
- 5+ years of experience in data architecture, search systems, knowledge engineering, or applied AI, with demonstrated experience designing and driving adoption of scalable data or AI architectures across teams or domains
- Hands-on experience designing, building, or optimizing RAG systems or semantic search platforms
- Strong understanding of vector embeddings, their generation, storage, and limitations across different embedding models
- Demonstrated experience with chunking strategies and the tradeoffs between granularity, context preservation, and retrieval quality
- Proficiency in hybrid retrieval approaches combining vector similarity, keyword search, and metadata filtering
- Experience with reranking techniques and relevance tuning for production retrieval systems
- Experience designing metadata schemas, taxonomies, ontologies, or knowledge graphs for enterprise data
- Proven ability to work with unstructured enterprise data (documents, PDFs, knowledge bases, transcripts, wikis)
- Experience designing and working with vector databases and search platforms (e.g., Pinecone, Weaviate, Qdrant, Elasticsearch, FAISS)
- Working knowledge of LLM APIs, prompt engineering, and orchestration patterns, with the ability to evaluate and adapt across frameworks (e.g., LangChain, LlamaIndex)
- Familiarity with data pipelines, ETL/ELT processes, and API architecture at a systems design level
- Understanding of access control, data security, and compliance considerations in AI-powered data systems
Preferred Qualifications
- Experience with knowledge graph technologies (e.g., Neo4j, RDF/OWL, SPARQL) and GraphRAG architectures
- Familiarity with agentic AI frameworks (e.g., LangGraph, CrewAI, AutoGen) and multi-agent system design
- Experience with LLMOps and observability tooling (e.g., LangSmith, Langfuse, RAGAS evaluation frameworks)
- Proficiency in Python for data processing, pipeline scripting, and integration tasks
- Experience with cloud AI services on AWS, Azure, or GCP (e.g., Amazon Bedrock, Azure AI, Vertex AI)
- Background in the home improvement, manufacturing, or direct-to-consumer industry
- Experience with Model Context Protocol (MCP) or similar standards for AI-to-tool interoperability
- Master’s degree in Computer Science, Data Science, Information Science, or a related field
Competencies
- Systems Thinking: Designs end-to-end architectures across ingestion, retrieval, and orchestration rather than isolated components
- Business Translation: Converts ambiguous domain knowledge and business rules into structured logic, metadata models, and retrieval strategies
- Data Modeling Mindset: Builds scalable, reusable schemas, taxonomies, and ontologies that serve multiple AI use cases
- Quality Ownership: Drives accuracy, consistency, grounding, and trust in AI outputs through rigorous evaluation and continuous improvement
- Cross-Functional Collaboration: Communicates complex technical concepts to non-technical stakeholders and partners effectively across engineering, product, and business teams
- Adaptability: Stays current with rapidly evolving AI technologies, frameworks, and best practices and applies them pragmatically to enterprise challenges
Success Measures
Success in this role is measured by:
- Establishment and adoption of a scalable metadata and taxonomy framework within the first 60 days
- A defined, documented, and enforced authority ranking and governance model across enterprise content sources
- Measurable improvements in retrieval accuracy across priority use cases (e.g., HR policies, call center knowledge, product documentation)
- A repeatable evaluation framework for RAG performance with defined benchmarks for precision, grounding, and consistency
- Reduction of irrelevant, inconsistent, or conflicting AI responses through conflict resolution and source prioritization logic
- Delivery of a documented AI data architecture blueprint that scales across new business domains and use cases
GDI is an Equal Employment Opportunity Employer.
#INDGDI