ABOUT THE ROLE
As an Associate Data Analyst, you will help transform unstructured permitting and review data into actionable insights that improve government efficiency and customer outcomes. Working with modern AI and machine learning technologies, including natural language processing (NLP) and large language models (LLMs), you will analyze permit review patterns, identify opportunities to reduce delays, and develop data-driven recommendations for government agency stakeholders. The role combines hands-on work with these technologies and the chance to help public sector organizations deliver services more effectively.
SPECIFIC RESPONSIBILITIES
- Analyze permit submission comments and rejection patterns to identify the most common causes of delays and opportunities to improve review cycle times.
- Build, maintain, and refine NLP pipelines that classify reviewer comments into meaningful topics using clustering techniques and LLM-based approaches.
- Design, test, and optimize prompts for Azure OpenAI and other AI models to classify, summarize, and extract insights from unstructured text.
- Develop metrics, dashboards, and reports that translate data analysis into actionable recommendations for internal and external stakeholders.
- Write and optimize SQL queries and Python-based ETL processes to extract, transform, and load data from SQL Server and cloud-based storage environments.
- Evaluate AI and machine learning model outputs for quality, accuracy, consistency, and appropriate handling of sensitive information.
- Design and execute experiments to measure and improve the effectiveness of classification models, prompts, and analytical approaches.
- Partner with cross-functional teams to support data-driven decision-making and continuous improvement initiatives.
REQUIRED QUALIFICATIONS
- Bachelor's degree in Data Science, Computer Science, Statistics, Computational Linguistics, or a related field.
- 1-2 years of experience in data science, analytics, or a related field; internships, academic projects, and professional work all count.
- Proficiency in Python, including experience with libraries such as pandas, numpy, and scikit-learn.
- Experience writing SQL queries, including joins, aggregations, and common table expressions (CTEs).
- Foundational understanding of natural language processing concepts, including text classification, clustering, and embeddings.
- Exposure to large language models (LLMs), including prompt engineering and output evaluation.
- Strong analytical and problem-solving skills with the ability to communicate findings clearly to both technical and non-technical audiences.
- Familiarity with Git and version control best practices.
DESIRED QUALIFICATIONS
- Experience working with Azure OpenAI, OpenAI APIs, Google Gemini, or similar AI platforms.
- Hands-on experience with prompt engineering techniques, including few-shot prompting and system prompt design.
- Familiarity with text embedding models, clustering and dimensionality reduction algorithms, and related libraries such as sentence-transformers, HDBSCAN, UMAP, or Top2Vec.
- Experience building or supporting ETL pipelines and data processing workflows.
- Exposure to Power BI, Parquet files, Azure Blob Storage, or similar analytics and cloud storage tools.
- Understanding of personally identifiable information (PII) handling requirements and responsible AI practices.
- Experience working with government data, permitting systems, civic technology, or other public sector environments.