
DeepSeek V3
| System Under Assessment | DeepSeek-Chat V3 | Assessment Date | March 2026 |
|---|---|---|---|
| Vendor / Provider | DeepSeek AI | Report ID | MV-2026-DSV3-001 |
| Decision Domain | General Decision Support (4 domains) | MAAC Instrument | v4.7 |
| Assessment Type | Single System | Assessor | Abdalla Doleh, PhD |
Cognitive Profile Overview
DeepSeek V3 was assessed across all nine MAAC cognitive dimensions using a stratified scenario corpus spanning four decision domains and three complexity tiers (n = 2,195 assessed responses). The model demonstrates exceptional output-oriented performance — Tool Execution (94), Content Quality (93), and Cognitive Load (89) — with adequate performance across memory, complexity, and transfer dimensions. A bounded residual is documented in Hallucination Control (HC Q6, λ=0.315), designated for monitoring rather than disqualification.
The model is rated for Baseline Deployment in general-purpose decision-support applications. Reliable cognitive sensitivity to task demands is confirmed by tier differentiation (η² = .28; simple M = 4.12, moderate M = 4.30, complex M = 4.48). High-stakes regulated-industry deployments should apply enhanced HC and KT monitoring protocols per the guidance in Section 04.
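For context on the tier-differentiation statistic, the sketch below computes η² in the conventional ANOVA sense (between-group sum of squares divided by total sum of squares). The per-tier score lists are illustrative placeholders, not the MAAC corpus data, and the function name is ours.

```python
# Minimal sketch: eta-squared for tier differentiation, computed as
# SS_between / SS_total over per-tier adjudicated score groups.
# Sample values below are illustrative, NOT the assessment data.

def eta_squared(groups: dict[str, list[float]]) -> float:
    all_scores = [s for g in groups.values() for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2
        for g in groups.values()
    )
    return ss_between / ss_total

scores_by_tier = {  # hypothetical response-level scores
    "simple":   [4.0, 4.1, 4.2, 4.2],
    "moderate": [4.2, 4.3, 4.3, 4.4],
    "complex":  [4.4, 4.5, 4.5, 4.5],
}
print(f"eta-squared = {eta_squared(scores_by_tier):.2f}")
```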

Composite Cognitive Score
The overall MAAC score is a composite of performance across all nine dimensions, normalized to a 0–100 scale. Composite scores above 80 meet the MaacVerify assessment threshold. Dimensional scores are weighted equally unless a domain-specific weighting profile is specified in the scoping document.
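To make the weighting rule concrete, here is a minimal sketch of the equal-weight composite, assuming the composite is the rounded mean of the nine dimensional scores reported in Section 03. The optional `weights` parameter is a hypothetical hook for the domain-specific weighting profile mentioned above, not part of the published instrument.

```python
# Sketch of the equal-weight composite: rounded mean of the nine
# 0-100 dimensional scores. Optional weights model the
# "domain-specific weighting profile"; the default is equal weighting.

DIMENSION_SCORES = {  # from Section 03 of this report
    "Cognitive Load": 89, "Tool Execution": 94, "Content Quality": 93,
    "Memory Integration": 78, "Complexity Handling": 79,
    "Hallucination Control": 78, "Knowledge Transfer": 75,
    "Processing Efficiency": 76, "Process-Outcome Alignment": 87,
}

def composite(scores: dict[str, int],
              weights: dict[str, float] | None = None) -> int:
    w = weights or {k: 1.0 for k in scores}
    total = sum(scores[k] * w[k] for k in scores)
    return round(total / sum(w.values()))

print(composite(DIMENSION_SCORES))  # -> 83, matching the reported composite
```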

Nine-Dimension Cognitive Profile
Each dimension is scored 0–100 using the MAAC v4.7 adjudication instrument. Flags indicate assessment status for each dimension: Strong (≥80), Monitor (60–79), or Flag (<60). Source data: Doleh et al. (2026).
| # | Dimension | Description | Score | Flag |
|---|---|---|---|---|
| 01 | Cognitive Load | Performance under sustained load | 89 / 100 | Strong |
| 02 | Tool Execution | Analytical tool & resource coordination | 94 / 100 | Strong |
| 03 | Content Quality | Coherence, richness, domain compliance | 93 / 100 | Strong |
| 04 | Memory Integration | Context across turns & long inputs | 78 / 100 | Monitor |
| 05 | Complexity Handling | Multi-step decomposition & solution quality | 79 / 100 | Monitor |
| 06 | Hallucination Control | Calibrated uncertainty & factual restraint | 78 / 100 | Monitor |
| 07 | Knowledge Transfer | Cross-domain & novel problem application | 75 / 100 | Monitor |
| 08 | Processing Efficiency | Cognitive economy relative to output quality | 76 / 100 | Monitor |
| 09 | Process-Outcome Alignment | Behavioral consistency between process & output | 87 / 100 | Strong |
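The Flag column follows mechanically from the thresholds stated above (Strong ≥ 80, Monitor 60–79, Flag < 60). A minimal sketch of that mapping, where the function name `flag` is ours rather than part of the instrument:

```python
# Sketch of the flag-assignment rule stated above:
# Strong (>= 80), Monitor (60-79), Flag (< 60).

def flag(score: int) -> str:
    if score >= 80:
        return "Strong"
    if score >= 60:
        return "Monitor"
    return "Flag"

assert flag(94) == "Strong"   # Tool Execution
assert flag(78) == "Monitor"  # Memory Integration, Hallucination Control
assert flag(59) == "Flag"     # below the Monitor band
```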

Application Suitability Assessment
Based on the nine-dimensional cognitive profile, the following deployment guidance applies. This guidance is derived from dimensional scores and domain-specific scenario performance, not subjective analysis.
Recommended deployment contexts:
- General-purpose analytical and planning decision support
- Complex multi-step reasoning (CH: 79, strong tier differentiation)
- Content generation requiring high accuracy and coherence (CQ: 93)
- High-volume analytical workflows where load management is critical (CL: 89)

Not recommended:
- Unmonitored clinical decision support (the HC residual requires validation)
- Autonomous legal document drafting without human review
- Mission-critical simple-task knowledge transfer (KT simple-tier M = 3.14)

Monitoring priorities:
- Hallucination Control: verification protocols on high-stakes factual claims
- Knowledge Transfer: monitor simple-tier generalization
- Memory Integration: extended context regression testing
- Processing Efficiency: throughput-quality tradeoff at scale

Reassessment triggers:
- Model update or architecture change
- Prompt revision or system instruction change
- Drift exceeding ±2.5% on any monitored dimension (a sketch of this check follows the list)
- Annual reassessment cadence (recommended)
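As referenced in the trigger list, the following is a minimal sketch of the ±2.5% drift check, assuming drift is measured as relative change from the assessment-date baseline. The current-score values are hypothetical, and any production monitor would be specified in the scoping document rather than here.

```python
# Illustrative sketch of the +/-2.5% drift trigger: compare current
# dimensional scores against the assessment-date baseline and flag
# any dimension whose relative change exceeds the threshold.
# Baseline values are from Section 03; "current" values are hypothetical.

DRIFT_THRESHOLD = 0.025  # +/-2.5% relative change

baseline = {"Cognitive Load": 89, "Hallucination Control": 78}
current = {"Cognitive Load": 88, "Hallucination Control": 75}  # hypothetical

def drifted(baseline: dict[str, int], current: dict[str, int]) -> list[str]:
    return [
        dim for dim, base in baseline.items()
        if abs(current[dim] - base) / base > DRIFT_THRESHOLD
    ]

print(drifted(baseline, current))
# ['Hallucination Control']  (3/78, about 3.8% > 2.5%): reassessment trigger
```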
How This Assessment Was Conducted
This assessment applied the MAAC framework (Doleh et al., 2026) using the validated Study 3 adversarial corpus. The MAAC v4.7 methodology is peer-reviewed and publicly available.

Official Assessment Record
This report confirms that DeepSeek V3 (DeepSeek AI) was assessed using the Multi-Dimensional Assessment for AI Cognition (MAAC) framework version 4.7 against a 4,238-scenario complexity-validated decision corpus spanning four decision domains and three complexity tiers.
The assessed system achieved an overall MAAC composite score of 83/100, reflecting a mean response-level adjudicated score of 4.316 (on the same scale as the tier means reported in the Cognitive Profile Overview) across 2,195 assessed responses. Dimensional assessment status is documented in Section 03 of this report.
This report may be used for procurement, regulatory documentation, and board-level AI governance purposes. It reflects performance at the time of assessment against the defined corpus and domain. Ongoing validity requires adherence to the reassessment triggers documented in Section 04.
Without an ongoing drift monitoring arrangement, MaacVerify makes no representations regarding continued performance after the assessment date.
| Report ID | MV-2026-DSV3-001 | Assessment Date | March 2026 |
|---|---|---|---|
| System | DeepSeek-Chat V3 | Domain | General Decision Support |
| Corpus Reference | 10.5281/zenodo.19776346 · v1 | MAAC Instrument | v4.7 |

We do not build, sell, or train AI models — eliminating the conflicts inherent in vendor self-evaluation.