Initial commit
This commit is contained in:
224
README.md
224
README.md
@@ -1,3 +1,223 @@
|
||||
# mikrotik-adlist-builder
|
||||
# Mikrotik Adlist Builder
|
||||
|
||||
Tool for building mikrotik adlists for blocking ands and harmful domains.
|
||||
`mikrotik-adlist-builder` is a small Python tool for building MikroTik adlists used to block ads and harmful domains.
|
||||
|
||||
It can download multiple blocklists from remote URLs, read local files, extract valid domain names from different formats, merge them, remove duplicates, and write the final output in a MikroTik-friendly format:
|
||||
|
||||
```text
|
||||
0.0.0.0 example.com
|
||||
0.0.0.0 ads.example.net
|
||||
```
|
||||
|
||||
## Features
|
||||
|
||||
- Supports multiple input sources
|
||||
- Downloads blocklists from `http://` and `https://` URLs
|
||||
- Reads local files from:
|
||||
- relative paths such as `./custom-domains.txt`
|
||||
- absolute paths
|
||||
- `file://` URLs
|
||||
- Supports multiple common blocklist formats:
|
||||
- ABP-style rules such as `||example.com^`
|
||||
- hosts file syntax such as `0.0.0.0 example.com`
|
||||
- plain domain lists such as `example.com`
|
||||
- Removes duplicates automatically
|
||||
- Filters out invalid entries
|
||||
- Writes a merged output file ready for MikroTik adlist import
|
||||
|
||||
## Requirements
|
||||
|
||||
- Python 3.9 or newer
|
||||
|
||||
## Installation
|
||||
|
||||
No external dependencies are required.
|
||||
|
||||
Clone the repository or just save the script locally:
|
||||
|
||||
```bash
|
||||
chmod +x mikrotik-adlist-builder.py
|
||||
```
|
||||
|
||||
You can then run it directly:
|
||||
|
||||
```bash
|
||||
./mikrotik-adlist-builder.py
|
||||
```
|
||||
|
||||
Or with Python:
|
||||
|
||||
```bash
|
||||
python3 mikrotik-adlist-builder.py
|
||||
```
|
||||
|
||||
## Default sources
|
||||
|
||||
The script includes a built-in `DEFAULT_URLS` list. Example:
|
||||
|
||||
```python
|
||||
DEFAULT_URLS = [
|
||||
"https://big.oisd.nl/",
|
||||
"https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts",
|
||||
"./custom-domains.txt",
|
||||
]
|
||||
```
|
||||
|
||||
This means the tool can combine public online blocklists with your own local domain list.
|
||||
|
||||
## Usage
|
||||
|
||||
### Use default sources
|
||||
|
||||
```bash
|
||||
python3 mikrotik-adlist-builder.py
|
||||
```
|
||||
|
||||
This will create:
|
||||
|
||||
```text
|
||||
adlist.txt
|
||||
```
|
||||
|
||||
### Specify custom URLs
|
||||
|
||||
```bash
|
||||
python3 mikrotik-adlist-builder.py \
|
||||
-u https://big.oisd.nl/ \
|
||||
-u https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
|
||||
```
|
||||
|
||||
### Mix remote and local sources
|
||||
|
||||
```bash
|
||||
python3 mikrotik-adlist-builder.py \
|
||||
-u https://big.oisd.nl/ \
|
||||
-u ./custom-domains.txt \
|
||||
-u ./my-extra-list.txt
|
||||
```
|
||||
|
||||
### Use a local file via `file://`
|
||||
|
||||
```bash
|
||||
python3 mikrotik-adlist-builder.py \
|
||||
-u file:///home/user/blocklists/custom.txt
|
||||
```
|
||||
|
||||
### Change output file
|
||||
|
||||
```bash
|
||||
python3 mikrotik-adlist-builder.py \
|
||||
-o mikrotik-adlist.txt
|
||||
```
|
||||
|
||||
### Full example
|
||||
|
||||
```bash
|
||||
python3 mikrotik-adlist-builder.py \
|
||||
-u https://big.oisd.nl/ \
|
||||
-u https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts \
|
||||
-u ./custom-domains.txt \
|
||||
-o mikrotik-adlist.txt
|
||||
```
|
||||
|
||||
## Supported input formats
|
||||
|
||||
### 1. ABP syntax
|
||||
|
||||
Example:
|
||||
|
||||
```text
|
||||
||example.com^
|
||||
||ads.example.net^
|
||||
```
|
||||
|
||||
Some simplified variants are also accepted, for example:
|
||||
|
||||
```text
|
||||
||example.com
|
||||
```
|
||||
|
||||
### 2. Hosts syntax
|
||||
|
||||
Example:
|
||||
|
||||
```text
|
||||
0.0.0.0 example.com
|
||||
127.0.0.1 ads.example.net
|
||||
```
|
||||
|
||||
### 3. Plain domain syntax
|
||||
|
||||
Example:
|
||||
|
||||
```text
|
||||
example.com
|
||||
ads.example.net
|
||||
tracker.example.org
|
||||
```
|
||||
|
||||
## Local custom domain file example
|
||||
|
||||
Example `custom-domains.txt`:
|
||||
|
||||
```text
|
||||
example-bad-site.com
|
||||
ads.example.net
|
||||
tracker.example.org
|
||||
```
|
||||
|
||||
You can also mix in hosts-style entries:
|
||||
|
||||
```text
|
||||
0.0.0.0 bad.example.com
|
||||
127.0.0.1 ads.badsite.net
|
||||
```
|
||||
|
||||
And ABP-style rules:
|
||||
|
||||
```text
|
||||
||tracker.example.org^
|
||||
||ads.example.net^
|
||||
```
|
||||
|
||||
## Output format
|
||||
|
||||
The generated file contains one domain per line in this format:
|
||||
|
||||
```text
|
||||
0.0.0.0 domain.tld
|
||||
```
|
||||
|
||||
Example:
|
||||
|
||||
```text
|
||||
0.0.0.0 ads.example.com
|
||||
0.0.0.0 tracker.example.net
|
||||
0.0.0.0 malware.example.org
|
||||
```
|
||||
|
||||
## Import into MikroTik
|
||||
|
||||
The resulting file is intended to be used as a source for a MikroTik adlist or for further processing before import, depending on your RouterOS version and setup.
|
||||
|
||||
## Notes
|
||||
|
||||
- Relative local paths are resolved against the current working directory from which you run the script.
|
||||
- `file://` paths should normally be absolute.
|
||||
- Duplicate domains are removed automatically.
|
||||
- Invalid lines, comments, whitelist rules, localhost-style entries, IPv6 entries, and malformed domains are ignored.
|
||||
|
||||
## Example output messages
|
||||
|
||||
```text
|
||||
[INFO] Downloading: https://big.oisd.nl/
|
||||
[INFO] Domains found: 123456
|
||||
[INFO] Downloading: ./custom-domains.txt
|
||||
[INFO] Domains found: 25
|
||||
[OK] Output written to: adlist.txt
|
||||
[OK] Total unique domains: 123470
|
||||
```
|
||||
|
||||
## License
|
||||
|
||||
Use, modify, and distribute freely as needed.
|
||||
1
custom-domains.txt
Normal file
1
custom-domains.txt
Normal file
@@ -0,0 +1 @@
|
||||
0.0.0.0 ssp.seznam.cz
|
||||
243
mikrotik-adlist-builder.py
Normal file
243
mikrotik-adlist-builder.py
Normal file
@@ -0,0 +1,243 @@
|
||||
#!/usr/bin/env python3
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import gzip
|
||||
import io
|
||||
import re
|
||||
import sys
|
||||
import urllib.request
|
||||
|
||||
from urllib.parse import urlparse, unquote
|
||||
from pathlib import Path
|
||||
from typing import Iterable, Optional, Set
|
||||
|
||||
DEFAULT_URLS = [
|
||||
# Popular blocklists that often use ABP syntax
|
||||
"https://big.oisd.nl/",
|
||||
"https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts",
|
||||
|
||||
# Custom blocklists. These may contain ABP rules,
|
||||
# but mainly plain domain syntax.
|
||||
"./custom-domains.txt",
|
||||
]
|
||||
|
||||
ABP_DOMAIN_RE = re.compile(r"^\|\|([A-Za-z0-9._-]+)\^$")
|
||||
HOSTS_SPLIT_RE = re.compile(r"\s+")
|
||||
VALID_DOMAIN_RE = re.compile(
|
||||
r"^(?=.{1,253}$)(?!-)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9-]{2,63}\.?$",
|
||||
re.IGNORECASE,
|
||||
)
|
||||
|
||||
def download_text(url: str, timeout: int = 30) -> str:
|
||||
parsed = urlparse(url)
|
||||
|
||||
# Local file via file://
|
||||
if parsed.scheme == "file":
|
||||
path = Path(unquote(parsed.path))
|
||||
with open(path, "r", encoding="utf-8", errors="replace") as f:
|
||||
return f.read()
|
||||
|
||||
# Local file without scheme, e.g. ./list.txt or list.txt
|
||||
if parsed.scheme == "":
|
||||
path = Path(url)
|
||||
with open(path, "r", encoding="utf-8", errors="replace") as f:
|
||||
return f.read()
|
||||
|
||||
# HTTP/HTTPS
|
||||
req = urllib.request.Request(
|
||||
url,
|
||||
headers={
|
||||
"User-Agent": "mikrotik-adlist-builder/1.0",
|
||||
"Accept-Encoding": "gzip",
|
||||
},
|
||||
)
|
||||
|
||||
with urllib.request.urlopen(req, timeout=timeout) as response:
|
||||
raw = response.read()
|
||||
encoding = response.headers.get("Content-Encoding", "").lower()
|
||||
|
||||
if encoding == "gzip":
|
||||
raw = gzip.decompress(raw)
|
||||
else:
|
||||
if len(raw) >= 2 and raw[:2] == b"\x1f\x8b":
|
||||
raw = gzip.decompress(raw)
|
||||
|
||||
charset = response.headers.get_content_charset() or "utf-8"
|
||||
return raw.decode(charset, errors="replace")
|
||||
|
||||
def normalize_domain(domain: str) -> Optional[str]:
|
||||
domain = domain.strip().lower().rstrip(".")
|
||||
if not domain:
|
||||
return None
|
||||
|
||||
if domain in {"localhost", "local", "broadcasthost"}:
|
||||
return None
|
||||
|
||||
if "/" in domain or "\\" in domain:
|
||||
return None
|
||||
|
||||
if ":" in domain:
|
||||
# Ignore IPv6, ports, and similar entries
|
||||
return None
|
||||
|
||||
if domain.startswith("*."):
|
||||
domain = domain[2:]
|
||||
|
||||
if not VALID_DOMAIN_RE.match(domain):
|
||||
return None
|
||||
|
||||
return domain
|
||||
|
||||
def extract_from_abp_line(line: str) -> Optional[str]:
|
||||
# Example: ||example.com^
|
||||
m = ABP_DOMAIN_RE.match(line)
|
||||
if m:
|
||||
return normalize_domain(m.group(1))
|
||||
|
||||
# Some variants may omit the trailing ^
|
||||
if line.startswith("||"):
|
||||
candidate = line[2:]
|
||||
for sep in ["^", "/", "$"]:
|
||||
if sep in candidate:
|
||||
candidate = candidate.split(sep, 1)[0]
|
||||
return normalize_domain(candidate)
|
||||
|
||||
return None
|
||||
|
||||
def extract_from_hosts_line(line: str) -> Set[str]:
|
||||
result: Set[str] = set()
|
||||
|
||||
# Remove inline comment
|
||||
line = line.split("#", 1)[0].strip()
|
||||
if not line:
|
||||
return result
|
||||
|
||||
parts = HOSTS_SPLIT_RE.split(line)
|
||||
if len(parts) < 2:
|
||||
return result
|
||||
|
||||
first = parts[0].lower()
|
||||
|
||||
# Common hosts file IP prefixes
|
||||
if first in {"0.0.0.0", "127.0.0.1", "::1", "::", "255.255.255.255"}:
|
||||
for item in parts[1:]:
|
||||
d = normalize_domain(item)
|
||||
if d:
|
||||
result.add(d)
|
||||
|
||||
return result
|
||||
|
||||
def extract_plain_domain(line: str) -> Optional[str]:
|
||||
line = line.strip()
|
||||
if not line:
|
||||
return None
|
||||
|
||||
if line.startswith(("!", "#", "[")):
|
||||
return None
|
||||
|
||||
if line.startswith("@@"):
|
||||
# Ignore whitelist rules
|
||||
return None
|
||||
|
||||
if line.startswith(("||", "|")):
|
||||
return None
|
||||
|
||||
if any(x in line for x in [" ", "\t", "/", "^", "$"]):
|
||||
return None
|
||||
|
||||
return normalize_domain(line)
|
||||
|
||||
def extract_domains(text: str) -> Set[str]:
|
||||
domains: Set[str] = set()
|
||||
|
||||
for raw_line in io.StringIO(text):
|
||||
line = raw_line.strip()
|
||||
|
||||
if not line:
|
||||
continue
|
||||
|
||||
# Skip comments and metadata
|
||||
if line.startswith(("!", "#", "[")):
|
||||
continue
|
||||
|
||||
# Skip ABP whitelist rules
|
||||
if line.startswith("@@"):
|
||||
continue
|
||||
|
||||
# 1) ABP syntax
|
||||
d = extract_from_abp_line(line)
|
||||
if d:
|
||||
domains.add(d)
|
||||
continue
|
||||
|
||||
# 2) hosts file syntax
|
||||
hosts_domains = extract_from_hosts_line(line)
|
||||
if hosts_domains:
|
||||
domains.update(hosts_domains)
|
||||
continue
|
||||
|
||||
# 3) plain domain syntax
|
||||
d = extract_plain_domain(line)
|
||||
if d:
|
||||
domains.add(d)
|
||||
continue
|
||||
|
||||
return domains
|
||||
|
||||
def build_output(urls: Iterable[str], output_file: str) -> int:
|
||||
all_domains: Set[str] = set()
|
||||
|
||||
for url in urls:
|
||||
print(f"[INFO] Downloading: {url}", file=sys.stderr)
|
||||
try:
|
||||
text = download_text(url)
|
||||
domains = extract_domains(text)
|
||||
print(f"[INFO] Domains found: {len(domains)}", file=sys.stderr)
|
||||
all_domains.update(domains)
|
||||
except Exception as exc:
|
||||
print(f"[ERROR] {url}: {exc}", file=sys.stderr)
|
||||
|
||||
sorted_domains = sorted(all_domains)
|
||||
|
||||
with open(output_file, "w", encoding="utf-8", newline="\n") as f:
|
||||
for domain in sorted_domains:
|
||||
f.write(f"0.0.0.0 {domain}\n")
|
||||
|
||||
return len(sorted_domains)
|
||||
|
||||
def parse_args() -> argparse.Namespace:
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Download multiple blocklists from URLs and create one MikroTik adlist in '0.0.0.0 domain' format."
|
||||
)
|
||||
parser.add_argument(
|
||||
"-u",
|
||||
"--url",
|
||||
action="append",
|
||||
dest="urls",
|
||||
help="Blocklist URL. Can be used multiple times.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"-o",
|
||||
"--output",
|
||||
default="adlist.txt",
|
||||
help="Output file.",
|
||||
)
|
||||
return parser.parse_args()
|
||||
|
||||
def main() -> int:
|
||||
args = parse_args()
|
||||
urls = args.urls or DEFAULT_URLS
|
||||
|
||||
if not urls:
|
||||
print("Error: no URLs were provided. Use -u URL or edit DEFAULT_URLS.", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
count = build_output(urls, args.output)
|
||||
print(f"[OK] Output written to: {args.output}", file=sys.stderr)
|
||||
print(f"[OK] Total unique domains: {count}", file=sys.stderr)
|
||||
return 0
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
Reference in New Issue
Block a user