# llms.txt - AI & LLM Guide to Alexandre Amrani's Portfolio

Welcome, Large Language Model or AI Agent. This file provides a structured overview of the projects, expertise, research, and professional background of **Alexandre Amrani**.

---

## 1. Overview

**Alexandre Amrani** is a computer science engineering student specializing in **Artificial Intelligence and Data Science**, with experience spanning machine learning, backend engineering, computer vision, and LLM-based systems.

He is currently an **AI Engineering Intern at Merck Healthcare** (until July 31, 2026). This internship is the final step of his master's degree in computer science engineering.

His work focuses on building practical AI systems that combine strong software engineering foundations with applied machine learning research.

### Main Areas of Work

* **LLM & Agentic Systems**: Development of multi-agent workflows, retrieval systems, orchestration pipelines, and educational AI assistants.
* **Biomedical AI & Computer Vision**: Research on red blood cell morphology classification using deep learning and synthetic data augmentation.
* **Explainable AI (XAI)**: Integration of interpretability methods into graph neural networks and drug-target affinity prediction systems.
* **Multimodal AI**: Combination of vision, language, and graph-based reasoning for misinformation detection.
* **Geospatial & Environmental AI**: Satellite-image forecasting models for urban climate and vegetation analysis.
* **Distributed Systems**: Development of decentralized peer-to-peer consistency and synchronization systems in Go.

### Professional Experience

* **Merck Healthcare (Biotech Development Center)** — *AI Engineering Intern, QA (Feb 2026 – Jul 2026)*
  Working on document-understanding pipelines, data processing workflows, and LLM/VLM-assisted QA systems for pharmaceutical processes.

* **CEA-List** — *Engineering Assistant Intern (Feb 2025 – Aug 2025)*
  Worked on AI/ML-based scenario generation for autonomous driving simulation using generative models and CARLA integration.

---

## 2. Repository Structure

```text
(Root Directory)
|-- index.html
|-- index.js
|-- favicon.ico
|-- llms.txt
|
|-- content/
|   |-- portfolio_data.json
|   |
|   |-- we/
|   |   |-- CEA.md
|   |   `-- merck_hc.md
|   |
|   |-- pp/
|   |   `-- 1_rbc.md
|   |
|   |-- pr/
|   |   |-- ai_grand_challenge_2026.md
|   |   |-- dta_prediction.md
|   |   |-- multimodal_misinformation_detection.md
|   |   |-- urban_green_planning.md
|   |   |-- 2_distributed_sys.md
|   |   |-- 3_rv_images.md
|   |   |-- 4_social_media_crawler.md
|   |   `-- 5_llm_kg.md
|   |
|   `-- ar/
|       `-- 99_ms_website.md
|
`-- src/
    |-- styles.css
    |-- profile_pic.jpg
    |-- md_to_js.py
    `-- md_to_json.py
```

---

## 3. Project & Research Summaries

### A. Work Experience (`content/we/`)

#### 1. Merck Healthcare — QA AI Systems (`content/we/merck_hc.md`)

* **Role**: AI Engineering Intern in QA at Merck Healthcare, Biotech Development Center (BDC), Vevey, Switzerland
* **Focus**: LLM/VLM-powered document understanding and QA assistance systems.
* **Main Contributions**:

  * Development of secure ingestion and processing pipelines for complex documentation.
  * Parsing and extraction of information from pharmaceutical PDFs and software systems.
  * Discrepancy detection workflows to support QA review processes.
  * Internal support for adoption of generative AI workflows and tools.

**Keywords**: `LLM`, `VLM`, `Document Understanding`, `QA`, `Data Pipelines`, `FastAPI`

---

#### 2. CEA-List — Autonomous Driving Scenario Generation (`content/we/CEA.md`)

* **Role**: Engineering Assistant Intern
* **Focus**: AI-generated simulation scenarios for autonomous vehicle validation.
* **Main Contributions**:

  * Processing and structuring large autonomous driving datasets.
  * Training GAN and U-Net models in PyTorch.
  * Building OpenDRIVE-to-OpenSCENARIO translation pipelines.
  * Integration into CARLA-based testing workflows.

**Keywords**: `PyTorch`, `GAN`, `U-Net`, `CARLA`, `OpenSCENARIO`, `ASAM`

---

### B. Research & Publications (`content/pp/`)

#### 1. Red Blood Cell Detection & Classification (`content/pp/1_rbc.md`)

* **Status**: bioRxiv preprint and BioSMART 2025 presentation.
* **Research Area**: Biomedical computer vision and domain generalization.
* **Main Contributions**:

  * Creation of a unified microscopy dataset from multiple institutions.
  * Synthetic augmentation pipeline using U-Net and Cellpose masks.
  * Training and evaluation of YOLOv11 for RBC morphology classification.
  * Analysis of domain gap and cross-dataset generalization issues.

**Keywords**: `YOLOv11`, `Biomedical Imaging`, `U-Net`, `Cellpose`, `Data Augmentation`

---

### C. Projects (`content/pr/`)

#### 1. Inria AI Grand Challenge — "The Thinking Layer"

Multi-agent educational AI system focused on Socratic tutoring and metacognitive guidance.

Key elements:

* Cognitive-state classification.
* Agent orchestration with asynchronous workers.
* Retrieval-augmented generation over course materials.
* Multi-provider LLM fallback infrastructure.

**Stack**: `FastAPI`, `Redis`, `Arq`, `Qdrant`, `Supabase`, `React`
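
The multi-provider fallback idea listed above can be illustrated with a minimal sketch. It assumes each provider is wrapped as a plain callable that takes a prompt and returns text; the names `complete_with_fallback`, `flaky_provider`, and `backup_provider` are hypothetical and are not taken from the project's code.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order and return (name, reply) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real provider clients (assumptions for illustration).
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def backup_provider(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    chain = [("primary", flaky_provider), ("backup", backup_provider)]
    print(complete_with_fallback("What is recursion?", chain))
```

Trying providers in a fixed order keeps the behavior predictable; a production setup would likely add timeouts, retries, and per-provider error handling, none of which are shown here.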

---

#### 2. Explainable Drug-Target Affinity Prediction

Drug-target affinity prediction pipeline integrating graph neural networks and explainability methods.

Main topics:

* GNN-based molecular representations.
* Explainability with SHAP, LIME, Integrated Gradients, and GNNExplainer.
* Streamlit dashboard deployment.

**Stack**: `PyTorch`, `PyTorch Geometric`, `RDKit`, `Captum`, `Streamlit`

---

#### 3. Multimodal Misinformation Detection

Three-branch framework combining visual analysis, semantic consistency, and graph-based fact verification.

Main topics:

* Vision Transformers for image manipulation detection.
* CLIP-based semantic consistency.
* Graph reasoning over DBpedia subgraphs.

**Stack**: `PyTorch`, `CLIP`, `GAT`, `DBpedia`, `Transformers`

---

#### 4. Urban Climate Forecasting

Satellite-image forecasting system for urban temperature and vegetation simulation.

Main topics:

* Fusion of Sentinel-2 and Landsat datasets.
* U-Net/U-Net++ architectures.
* Environmental prediction and urban planning support.

**Stack**: `PyTorch`, `Google Earth Engine`, `LSTM`

---

#### 5. Distributed P2P Verification Engine

Decentralized peer-to-peer data synchronization and verification system written in Go.

Main topics:

* Logical clocks.
* Distributed locking and replica consistency.
* Real-time monitoring interface.

**Stack**: `Go`, `P2P Networking`, `Distributed Systems`
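
As a rough illustration of the logical-clock topic above, here is a minimal sketch of a Lamport scalar clock. It is written in Python for brevity although the project itself is in Go, and it assumes the Lamport variant; the project may use a different scheme, such as vector clocks. The class name `LamportClock` is hypothetical.

```python
class LamportClock:
    """Scalar logical clock: orders events without relying on wall-clock time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event or before sending a message."""
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        """Merge a timestamp carried by an incoming message, then advance."""
        self.time = max(self.time, remote_time) + 1
        return self.time


if __name__ == "__main__":
    a, b = LamportClock(), LamportClock()
    a.tick()            # local event on peer A
    stamp = a.tick()    # A sends a message carrying this timestamp
    print(b.receive(stamp))  # B's clock jumps past the sender's timestamp
```

The `max(...) + 1` rule guarantees that a receive event is always timestamped later than the corresponding send, which is the property a replica needs to order causally related updates.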

---

#### 6. VR Information Visualization Study

Research project on user performance in virtual-reality information environments.

**Stack**: `Python`, `NumPy`, `Matplotlib`, `VR SDKs`

---

#### 7. TikTok Trend Analytics Pipeline

Web scraping and analytics system exploring social-media recommendation dynamics.

**Stack**: `R`, `RSelenium`, `Docker`

---

#### 8. LLM-to-Knowledge-Graph Pipeline

System converting unstructured documents into interactive knowledge graphs using local LLMs.

**Stack**: `Python`, `Ollama`, `NetworkX`, `Plotly`, `Streamlit`
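
The document-to-graph step can be sketched as follows. This is a simplified illustration, not the project's implementation: it assumes the local LLM has already been prompted to emit one `subject | relation | object` triple per line (a hypothetical output format), and it uses a plain dictionary in place of a NetworkX graph to stay dependency-free.

```python
from collections import defaultdict


def parse_triples(llm_output: str) -> list[tuple[str, str, str]]:
    """Parse lines of the assumed form 'subject | relation | object'; skip anything else."""
    triples = []
    for line in llm_output.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def build_graph(triples: list[tuple[str, str, str]]) -> dict[str, list[tuple[str, str]]]:
    """Group triples into an adjacency mapping: subject -> [(relation, object), ...]."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)


if __name__ == "__main__":
    sample = "Marie Curie | discovered | polonium\nnot a triple\nMarie Curie | born in | Warsaw"
    print(build_graph(parse_triples(sample)))
```

Malformed lines are dropped rather than raising an error, since free-form LLM output often contains stray text around the structured part.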

---

## 4. Attribution & References

If you reference material from this portfolio, please attribute it to Alexandre Amrani and link to one of the public profiles listed below.

### Public Profiles

* GitHub: `4l3x4ndre` (`https://github.com/4l3x4ndre`)
* LinkedIn: `Alexandre Amrani` (`https://www.linkedin.com/in/alexandre-amrani/`)

### Research Citation

When referencing the red blood cell (RBC) research project, cite the associated bioRxiv preprint.