Cybercriminals compromise domain names to attack the owners or users of the domains directly, or use them for various nefarious endeavors, including phishing, malware distribution, and command and control (C2) operations. A special case of DNS hijacking is called domain shadowing, where attackers stealthily create malicious subdomains under compromised domain names. Shadowed domains do not affect the normal operation of the compromised domains, making it hard for victims to detect them. The inconspicuousness of these subdomains often allows perpetrators to take advantage of the compromised domain’s benign reputation for a long time.
Current threat research-based detection approaches are labor-intensive and slow as they rely on the discovery of malicious campaigns that use shadowed domains before they can look for related domains in various data sets. To address these issues, we designed and implemented an automated pipeline that can detect shadowed domains faster on a large scale for campaigns that are not yet known. Our system processes terabytes of passive DNS logs every day to extract features about candidate shadowed domains. Building on these features, it uses a high-precision machine learning model to identify shadowed domain names. Our model finds hundreds of shadowed domains created daily under dozens of compromised domain names.