LM-Kit.NET
An enterprise-grade .NET SDK for building LLM applications.
Published by LM-Kit.
Available for purchase through ComponentSource since 2025.
Released: May 6, 2026
- Metadata-only loading for encrypted models (LMKit.Cryptography, LMKit.Model, LMKit.Hardware): LM.LoadEncrypted now honors LoadingOptions.LoadTensors = false, mirroring the plaintext metadata-only path. Only the metadata block is decrypted, no tensor bytes are read, and the resulting LM exposes architecture, vocabulary, context length, layer count, and other GGUF metadata. Use this for fast catalog inspection or pre-flight checks on protected .lmke containers (see the metadata-only loading sketch after this list).

- MemoryEstimation.FitParameters overload for encrypted containers (LMKit.Hardware): the new FitParameters(string encryptedPath, GgufEncryptionScheme scheme, string password, ...) overload runs the native fit estimator against an encrypted GGUF without ever materializing tensor bytes. The existing FitParameters(LM model, ...) overload also now works on models loaded via LM.LoadEncrypted and reuses the metadata cached at load time, so callers do not need to re-supply the password. Tensor data is never decrypted during estimation (see the fit-estimation sketch after this list).

- LM.IsEncrypted property (LMKit.Model): true when the instance was loaded via LM.LoadEncrypted. Lets downstream code branch on encryption state without inspecting the file path.

- LM.DeviceConfiguration.AutoFitToVram property (LMKit.Model): controls whether the model loader automatically retries with progressively fewer GPU layers when the first load attempt fails because the model does not fit in available VRAM. Default is true. When enabled, the runtime walks GpuLayerCount down, placing the remaining layers in system memory, until the model loads or the entire model is on CPU. Set to false to restore the previous behavior of failing loudly on insufficient VRAM (see the device-configuration sketch after this list).

- Automatic GPU-layer fallback on insufficient VRAM (LMKit.Model): when a model load fails because the model does not fit in the GPU's available VRAM, the loader now automatically retries with progressively fewer GPU layers, placing the remaining layers in system memory, until the model loads or the entire model is on CPU. This replaces the previous behavior, where insufficient VRAM produced an immediate exception. The fallback is gated by the new DeviceConfiguration.AutoFitToVram flag (default true); set it to false to restore the previous fail-loud behavior.

- Context-size pre-flight check (LMKit.Model, LMKit.TextGeneration): before allocating a new inference context, the runtime now estimates the KV-cache and compute-buffer cost for the requested context size and compares it to the device's currently free VRAM. If the projection exceeds free memory, the context size is reduced up front, avoiding a doomed allocation attempt. On an actual allocation failure during creation, the runtime additionally retries with progressively smaller context sizes before throwing.

- Richer context-allocation failure diagnostics (LMKit.Exceptions): when the runtime cannot allocate an inference context even after its built-in retries, the thrown RuntimeException now includes the device's free VRAM at failure time and a hint to either set DeviceConfiguration.GpuLayerCount = 0 for CPU-only loading or shrink the requested context size (see the exception-handling sketch after this list).

- LM.DeviceConfiguration.ForceCpuMode property (LMKit.Model): this property is functionally duplicated by setting GpuLayerCount = 0, which routes the entire model and KV cache to system memory. Callers who set ForceCpuMode = true should set GpuLayerCount = 0 instead; the device-configuration sketch after this list shows the replacement pattern.
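Metadata-only loading sketch. A minimal example of inspecting a protected container without decrypting tensor data. The exact LM.LoadEncrypted parameter list, the GgufEncryptionScheme member name, and the file path are assumptions for illustration; only LoadingOptions.LoadTensors and LM.IsEncrypted come from the notes above.

```csharp
using System;
using LMKit.Model;

// Metadata-only inspection of a protected .lmke container.
var options = new LoadingOptions
{
    LoadTensors = false // decrypt the metadata block only; no tensor bytes are read
};

LM model = LM.LoadEncrypted(
    "models/example.lmke",           // hypothetical container path
    GgufEncryptionScheme.Aes256,     // assumed enum member name
    "container-password",            // placeholder password
    options);

Console.WriteLine(model.IsEncrypted); // true: instance came from LoadEncrypted
// Architecture, vocabulary, context length, layer count, and other GGUF
// metadata are now readable without ever touching tensor data.
```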
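Fit-estimation sketch. Both FitParameters paths described above, under the assumption that FitParameters is constructed with the signatures quoted in the release note; the trailing parameters elided as "..." in the note are left out here, and the path, scheme member, and password are placeholders.

```csharp
using LMKit.Hardware;
using LMKit.Model;

// Path A: run the native fit estimator directly against the encrypted
// GGUF. Tensor bytes are never materialized or decrypted.
var fromFile = new MemoryEstimation.FitParameters(
    "models/example.lmke",           // hypothetical container path
    GgufEncryptionScheme.Aes256,     // assumed enum member name
    "container-password");           // further optional parameters elided ("...")

// Path B: reuse a model already loaded via LM.LoadEncrypted; the metadata
// cached at load time is reused, so no password is re-supplied.
MemoryEstimation.FitParameters EstimateFor(LM model) =>
    new MemoryEstimation.FitParameters(model);
```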
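Device-configuration sketch. The VRAM-fallback knobs from the entries above; only AutoFitToVram and GpuLayerCount are taken from the notes, while the assumption that these settings apply to subsequent loads and context creation on an existing LM instance is mine.

```csharp
using LMKit.Model;

// Configure VRAM-fallback behavior on an LM instance.
static void ConfigureVramPolicy(LM model, bool cpuOnly)
{
    if (cpuOnly)
    {
        // Routes the entire model and KV cache to system memory;
        // preferred over the deprecated-style ForceCpuMode = true pattern.
        model.DeviceConfiguration.GpuLayerCount = 0;
        return;
    }

    // Default is true: on insufficient VRAM the loader retries with
    // progressively fewer GPU layers, spilling the rest to system memory,
    // until the model loads or runs entirely on CPU.
    model.DeviceConfiguration.AutoFitToVram = true;

    // Set to false to restore the previous fail-loud behavior:
    // model.DeviceConfiguration.AutoFitToVram = false;
}
```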
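Exception-handling sketch. Catching the enriched allocation failure. RuntimeException and the remediation hints are from the note above; the context-creation call itself is a placeholder, and any exception members beyond Message are not specified in the notes.

```csharp
using System;
using LMKit.Exceptions;
using LMKit.Model;

static void CreateContextOrFallBack(LM model)
{
    try
    {
        // ... create the inference context here (placeholder) ...
    }
    catch (RuntimeException ex)
    {
        // The message now reports the device's free VRAM at failure time
        // and hints at either GpuLayerCount = 0 (CPU-only loading) or a
        // smaller requested context size.
        Console.Error.WriteLine(ex.Message);

        // One possible remediation: force CPU-only mode and retry.
        model.DeviceConfiguration.GpuLayerCount = 0;
        // ... retry context creation ...
    }
}
```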