Technical Documentation for EVE
Legal name for the model provider: Pi School S.R.L
Model name: EVE-instruct
Release date: [13 February 2026]
Model dependencies: Mistral-Small-3.2-24B-Instruct-2506
Input modalities:
- Text: input length 128,000 tokens
Output modalities:
- Text*:* maximum output length 128,000 tokens
Total model size: 24B
Distribution channels:
- Open-source repository on HuggingFace
- GUI and API via private hosting
License: Apache License 2.0
Acceptable Use Policy: Users should employ the system for Earth Observation and Earth Science related tasks. Utilization of the system for adjacent domains such as Mathematics, Physics, Computer Science, and Environmental Engineering is also acceptable, provided these applications support or relate to Earth Science objectives. The system is not intended for use cases outside the scope described above. Users are expected to verify critical information through primary sources and exercise appropriate judgment when applying outputs to decision-making processes.
Intended uses: EVE is designed to support users in a variety of natural language processing tasks, easing access to and processing of domain-specific textual information. Users can leverage the system to:
- Research methodologies, datasets, formulas, and scientific tables
- Perform comparisons and analyses across different topics within the domain
- Query the system about scientific content, environmental policies, satellite missions, and observational techniques
- Summarize and synthesize information from Earth Science literature
- Understand complex geospatial concepts and remote sensing principles
- Explore climate science, atmospheric studies, oceanography, and related disciplines
Intended Purpose: The system, EVE (Earth Virtual Expert), is a specialized Large Language Model (LLM) designed for the Earth Science and Earth Observation domains. The model is intended to serve a diverse user base, including students, researchers, policy makers, and professionals with an interest in these fields. EVE is capable of assisting users with a range of tasks, including question answering, summarization, information retrieval, and domain-specific analysis.
Type and nature of AI systems in which the general-purpose AI model can be integrated: EVE (Earth Virtual Expert) is intended to be integrated by downstream providers as a conversational assistant and domain-specific research tool for Earth Observation and Earth Science use cases. It can be used in:
- Chat interfaces for research support, document summarisation, and question answering.
- Decision-support tools that help human experts explore EO/ES literature and datasets, where final decisions remain with human users.
- Internal knowledge assistants that answer questions based on curated EO/ES corpora via RAG.
Technical means for model integration:
- Deploy the model on-premises or on a cloud provider, following the OpenAI compatible format
- Access via API endpoints exposing text-in / text-out interfaces for inference.
Required hardware: running EVE-instruct requires ~55 GB of GPU RAM in bf16 or fp16.
Required software: EVE-instruct requires the following:
- vLLM>=0.9.1
- transformers
- mistral-common >= 1.6.2
Data type/modality: text
Data provenance:
- Web scraping of the most important journals, publisher and agencies in the field
- Private non-publicly available datasets obtained from third parties
- Synthetic data generated from the collected ones
- Full information on the main HuggingFace release of the corpus.
Data curation methodologies: collected data was initially converted into Markdown, a common machine-processable text format, using a combination of OCR and rule-based algorithms. The text obtained is cleaned by removing duplications, removing noisy documents, fixing common OCR errors, fixing formulas, and tables. After all the personal identifiable information is removed, by replacing it is replaced with a placeholder text. The cleaned collection is then divided into chunks and filtered by removing uninformative chunks that contain mostly anonymized information or that were badly converted from the original document.
TRANSPARENCY
Consistent with Article 50 (Transparency Obligations for Providers and Deployers of AI Systems), we provide the following disclosures:
- Identification of AI System: EVE is an automated artificial intelligence chatbot system, not a human agent. When you use the EVE Chatbot Service at https://eve.philab.esa.int/chat, you are interacting directly with an AI system powered by large language models (LLMs). You are not communicating with human staff members.
- Limitations and Capabilities: EVE can:
- Provide general informational responses
- Answer questions across broad knowledge domains
- Generate text-based content
- EVE cannot and should not be relied upon for:
- Legal advice or professional consultation (despite best efforts, may contain errors)
- Real-time information (training data has a knowledge cutoff; current events may not be accurate)
- Binding commitments or official decisions
- Content Generated by AI: All responses generated by EVE are artificially generated using statistical language models. Responses are not authored by humans and may contain inaccuracies, biases, or inappropriate content. Users should verify critical information independently.
- Training Data Transparency: For information about how EVE was trained and the data sources used, please refer to this Technical Documentation.
We reserve the right to update EVE's capabilities, limitations, and training methodologies. Material changes that affect transparency will be disclosed in updated versions of this Privacy Policy and the Technical Documentation.