
__start__
]):::first
load_job_ad_web(load_job_ad_web)
load_job_ad_plain_text(load_job_ad_plain_text)
load_job_ad_linkedin(load_job_ad_linkedin)
validate_job_ad_content(validate_job_ad_content)
validate_meta_title(validate_meta_title)
extract_job_title_from_content(extract_job_title_from_content)
extract_recipient_for_cover_letter(extract_recipient_for_cover_letter)
extract_language_from_job_ad(extract_language_from_job_ad)
extract_relevant_info_from_job_ad(extract_relevant_info_from_job_ad)
store_job_ad(store job ad)
__end__([__end__]):::last
__start__ -.-> __end__;
__start__ -.-> load_job_ad_linkedin;
__start__ -.-> load_job_ad_plain_text;
__start__ -.-> load_job_ad_web;
extract_job_title_from_content --> extract_recipient_for_cover_letter;
extract_language_from_job_ad --> extract_relevant_info_from_job_ad;
extract_recipient_for_cover_letter --> extract_language_from_job_ad;
extract_relevant_info_from_job_ad --> store_job_ad;
load_job_ad_linkedin --> validate_job_ad_content;
load_job_ad_plain_text --> validate_job_ad_content;
load_job_ad_web --> validate_job_ad_content;
validate_job_ad_content -.-> __end__;
validate_job_ad_content -.-> extract_job_title_from_content;
validate_job_ad_content -.-> validate_meta_title;
validate_meta_title -.-> extract_job_title_from_content;
validate_meta_title -.-> extract_recipient_for_cover_letter;
store_job_ad --> __end__;
classDef default fill:#f2f0ff,line-height:1.2
classDef first fill-opacity:0
classDef last fill:#bfb6fc
```

### Cover letter writing agent

- For the letter agent, CV data has to be available, and job ad data must have been loaded previously, or the agent has to be run together with the job ad extraction agent.
- It is best practice to have extensive information in the CV data, much more than one would put in a CV actually sent with an application.
- There are two different options for how the agent processes the data to write the letter paragraphs:
  1. Simple. All the job ad data and the relevant CV data are provided directly to the LLM, with a prompt to write the paragraphs based on that information. In this case the LLM has to filter and match the relevant information in one step, and also has to write the text. The results sometimes tend to be not entirely truthful.
  2. Advanced, RAG-like. Experiences and projects are loaded into a Chroma DB. Based on the tasks and requirements/qualifications from the job ad, the LLM generates a list of queries to search for in that DB. The queries are run against the CV with similarity search; if the similarity is close enough (below a threshold value), matches between CV and ad content are created. The list of these matches is then used to generate the paragraphs instead of the raw CV and ad content. The results are much more reliable, with less hallucination, e.g. of a skill being part of a professional experience when it actually wasn't. (A sketch of this matching step is shown after this list.)
- The agent first generates the letter extensions such as opening, closing, subject line and address field, based on the extracted language, company name, contact person, address and job title.
- The letter paragraphs are generated with one of the two options above; in both cases personal user preferences stored as LongTerm Memory are considered (like the number of paragraphs).
- Then the cover letter is generated with LaTeX.
- After that the agent is interrupted and the cover letter is shown with a feedback chat below it.
- If feedback is given, the regeneration of the paragraphs is triggered, providing the LLM with the feedback message, the previous paragraphs and the data from the chosen paragraph-writing option.
- If either `quit` or `exit` is part of the feedback message, the feedback loop is terminated. As a last step, the previous feedback messages are used to identify general user preferences for the paragraphs that are not ad specific; these are stored as LongTerm Memory.
- After that it is possible to edit the elements of the letter manually and regenerate the letter without the LLM. The letter generation is not a tool meant to fool anybody; therefore a side note with an explanation that the letter is AI generated is placed on it, to make it more ethical.
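The matching step of the advanced option can be illustrated roughly as follows. This is a minimal sketch assuming `chromadb` with its default embedding function; the collection name, the threshold value and the sample documents are placeholders, not the actual implementation.

```python
import chromadb

client = chromadb.Client()
# Hypothetical collection; the real agent loads all CV experiences and projects.
collection = client.create_collection("cv_items")
collection.add(
    documents=[
        "Built ETL pipelines in Python and Airflow",
        "Fine-tuned transformer models for text classification",
    ],
    ids=["exp-1", "exp-2"],
)

def match_requirements(queries: list[str], threshold: float = 0.8) -> list[dict]:
    """Pair LLM-generated queries from the job ad with similar CV items."""
    matches = []
    for query in queries:
        result = collection.query(query_texts=[query], n_results=1)
        distance = result["distances"][0][0]
        if distance < threshold:  # close enough: keep the CV/ad match
            matches.append(
                {"requirement": query, "cv_item": result["documents"][0][0]}
            )
    return matches
```

Writing the paragraphs only from such verified matches is what keeps the advanced option from claiming skills that are not actually backed by the CV.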
```mermaid
---
config:
  flowchart:
    curve: linear
---
graph TD;
__start__([__start__]):::first
write_cover_letter_paragraphs(write_cover_letter_paragraphs)
query_cv_data(query_cv_data)
write_cover_letter_paragraphs_with_cv(write_cover_letter_paragraphs_with_cv)
write_cover_letter_extensions(write_cover_letter_extensions)
generate_cover_letter(generate_cover_letter)
feedback_loop(feedback_loop)
__end__([__end__]):::last
__start__ --> write_cover_letter_extensions;
feedback_loop -.-> __end__;
feedback_loop -.-> write_cover_letter_paragraphs;
feedback_loop -.-> write_cover_letter_paragraphs_with_cv;
generate_cover_letter --> feedback_loop;
query_cv_data --> write_cover_letter_paragraphs_with_cv;
write_cover_letter_extensions -.-> query_cv_data;
write_cover_letter_extensions -.-> write_cover_letter_paragraphs;
write_cover_letter_paragraphs --> generate_cover_letter;
write_cover_letter_paragraphs_with_cv --> generate_cover_letter;
classDef default fill:#f2f0ff,line-height:1.2
classDef first fill-opacity:0
classDef last fill:#bfb6fc
```

### LinkedIn CV extract agent

- `LINKEDINCOOKIE` is required for this feature.
- Provide the URL to your LinkedIn profile.
- LinkedInScraper is utilized to scrape the data from the profile, logged in with the `LINKEDINCOOKIE`. Unfortunately the package is not well maintained, so the method has some limitations; common troubles are:
  - Several professional experience items are interpreted by the package as a single one.
  - For the educational background the package does not provide the dates.
  - General mix-up of the entries in the dictionary of the package.
- The agent then has to go through the experience items provided by LinkedInScraper: the full dictionary of an entry is passed and newly sorted with the help of the LLM. The cleaning works fine; the information is passed to the intended keys, with no hallucination of information that was not given and no wrong placements. (A sketch of this step is shown below.)
- Cleaning the stations of the educational background is done similarly to the experience items, but since dates are never provided for educational items by LinkedInScraper, semi-random placeholders are created.
- After the agent's workflow the CV is generated based on the extracted data.
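The cleaning step could look roughly like the sketch below, using LangChain's structured output with a pydantic schema. The model name and the field names are illustrative assumptions, not the project's actual schema.

```python
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

class ExperienceItem(BaseModel):
    """Illustrative target schema, not the project's actual fields."""
    job_title: str
    company: str
    start_date: str
    end_date: str
    description: str

llm = ChatOpenAI(model="gpt-4o-mini").with_structured_output(ExperienceItem)

def clean_experience_item(raw_item: dict) -> ExperienceItem:
    """Re-sort a raw LinkedInScraper dictionary into the intended keys."""
    return llm.invoke(
        "Sort this scraped LinkedIn experience entry into the schema. "
        "Use only information that is actually present; do not invent anything:\n"
        f"{raw_item}"
    )
```

Constraining the LLM to a fixed schema like this is what keeps the cleaning from inventing values for keys the scraper did not fill.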
```mermaid
---
config:
  flowchart:
    curve: linear
---
graph TD;
__start__([__start__]):::first
Scrape_LinkedIn_Profile(Scrape LinkedIn Profile)
Clean_Experience_Item(Clean Experience Item)
Clean_Education_Item(Clean Education Item)
__end__([__end__]):::last
Clean_Experience_Item --> Clean_Education_Item;
Scrape_LinkedIn_Profile --> Clean_Experience_Item;
__start__ --> Scrape_LinkedIn_Profile;
Clean_Education_Item --> __end__;
classDef default fill:#f2f0ff,line-height:1.2
classDef first fill-opacity:0
classDef last fill:#bfb6fc
```

### Job search Agent

- In the chat the LLM can be asked to search for jobs.
- It is necessary to provide some kind of job title that the LLM can use for a query; it is also mandatory to specify a city where to search (optionally, a search area radius can be mentioned).
- The database is from the German **Agentur für Arbeit**. In general only jobs in Germany are listed, and the job information (title and so on) is in German, except for jobs whose postings are also in English. Jobs like "Data Scientist", "AI Engineer" etc. are often not translated into German.
- The agent first validates the user message.
- On success, a tool for retrieving the job postings should be called by the LLM. If the application is run via `uv`, a Docker MCP server is started with it. The job search tool is provided by that MCP server, which calls the corresponding API. If the app is already run in the Docker container, the tool function is used in that container. (A sketch of such a tool is shown below.)
- An additional tool to scrape the personal job recommendations directly from LinkedIn is currently under development.
- The results are processed.
- After the agent is done, the job listings are visualized as a table in the webapp. Unfortunately the database rarely provides the URL to the original job post, and detailed descriptions are not provided either.
- Therefore there is currently no feature to directly start the cover letter agent from an item of the job listing table. But with company and job title it should be easy to find the original job posting and use it for the cover letter generation.
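How such a job search tool can be exposed to the LLM is sketched below. The endpoint URL, parameter names and response shape are placeholders standing in for the MCP-served API call, not the real interface.

```python
import requests
from langchain_core.tools import tool

@tool
def search_jobs(job_title: str, city: str, radius_km: int = 25) -> list[dict]:
    """Search the Agentur fuer Arbeit job database for postings."""
    response = requests.get(
        "http://localhost:8000/jobsearch",  # hypothetical MCP-backed endpoint
        params={"title": job_title, "city": city, "radius": radius_km},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["jobs"]  # assumed response shape
```

A chat model bound to this tool (for example via `llm.bind_tools([search_jobs])`) can then decide to call it once the validated user message contains a job title and a city.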
```mermaid
---
config:
  flowchart:
    curve: linear
---
graph TD;
__start__([__start__]):::first
input_validation(input validation)
Job_search_assistant(Job search assistant)
tools(tools)
process_results(process results)
__end__([__end__]):::last
Job_search_assistant -.-> __end__;
Job_search_assistant -.-> tools;
__start__ --> input_validation;
input_validation -.-> Job_search_assistant;
input_validation -.-> __end__;
tools --> process_results;
process_results --> __end__;
classDef default fill:#f2f0ff,line-height:1.2
classDef first fill-opacity:0
classDef last fill:#bfb6fc
```

# AI Engineering Capstone

Case 2: AI Agent for Task Automation

### Objective:

Develop an AI agent capable of automating complex workflows such as data analysis, report generation, or customer support.

Several complex agent workflows are developed. See [Description of the Use Case](#Describtion-of-the-Use-Case).

### Key Tools:

LangChain, Python/JavaScript, OpenAI Function Calling, external APIs.

The key tools used are LangChain, Python and OpenAI function calling (with a self-developed MCP server); the self-developed MCP server calls an API.

### Steps:

1. Agent Design:
   - The agents' purpose and capabilities are clearly defined in [Description of the Use Case](#Describtion-of-the-Use-Case).
2. Tool Integration:
   - The agents are equipped with API calling, web scraping and database queries. See [Technical description](#Technical-description).
3. Agent Execution:
   - One of the developed agents has a LongTerm Memory for user preferences.
   - Several of the developed agents consist of consecutive steps with LLM calls to improve the results.
4. Interactive Prototyping:
   - There is an extensive and complex Streamlit webapp, with many different visualization features and embedded documentation.
5. Evaluation:
   - The agents are extensively tested, and based on the evaluation several enhancements are implemented.
   - Also, cover letters from the agent have already been used for applications, which has already led to an interview.
6. Documentation:
   - This README is a comprehensive report detailing the functionality. Parts of this documentation are also embedded in the corresponding features of the webapp.
   - Ethical considerations involve that AI generated content should be labelled; a side-note label is always placed on the generated PDF. Also, all personal data handled by the AI is publicly available.

# Future features

## Job ad search agent

+ Job recommendations from LinkedIn; this depends on the linkedinscraper package, and currently this feature of the package is not working.
+ Other job databases to get data from; then maybe a direct option to generate the application documents in one click, if the available data is sufficient.

## CV change agent

+ Agent to reduce and rewrite the content of the CV to highlight content that aligns with the job ad and to reduce content that does not.
+ Also, a summary at the top of the CV could be generated.
+ Translation of the CV.

## CV upload agent

+ Agent that takes a CV as PDF and deconstructs the information into the CvData class, so manually adding all the information in the beginning is not necessary.

**Diverse**

* Filtering and validating all manual entries.
* Alternative LinkedIn scraping: use the login from the actual scraper and pass the full page content to a filter agent.
* Docker multi-container system: separation of the features to achieve more resilience (restarting of parts) and better scalability for deployment.
* Proper login and authentication.
* Some fixing of minor bugs; a list of open issues is in the corresponding private repository.