avwx.service
Report Source Services
AVWX fetches raw weather reports from third-party services via REST API calls or file downloads. Service objects handle the request and extraction steps for us.
Basic Module Use
METARs and TAFs are the most widely supported report types, so an effort has been made to localize some of these services to a regional source. The get_service function was introduced to determine the best service for a given station.
# Fetch Australian reports
station = 'YWOL'
country = 'AU' # can source from avwx.Station.country
# Get the station's preferred service and initialize to fetch METARs
service = avwx.service.get_service(station, country)('metar')
# service is now avwx.service.Aubom init'd to fetch METARs
# Fetch the current METAR
report = service.fetch(station)
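As the comment above notes, the country code can come from the station data itself. A short sketch of that lookup; Station.from_icao is assumed here as the station constructor:

import avwx

# Derive the country code from the station record
# (Station.from_icao is an assumed lookup constructor)
station_info = avwx.Station.from_icao("YWOL")
service = avwx.service.get_service(station_info.icao, station_info.country)("metar")
report = service.fetch("YWOL")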
Other report types require specific service classes, which are found in their respective submodules. However, you can normally let the report type classes handle these services for you.
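For instance, the forecast services live in avwx.service.files and can be used directly. A minimal sketch, assuming the file-based services share the same fetch(station) interface as the scrape services and using KJFK purely as an illustrative station:

import avwx

# Use a file-based forecast service directly
nbm_service = avwx.service.NoaaNbm("nbs")
report = nbm_service.fetch("KJFK")  # assumes the shared fetch(station) interface

# Or let the report class manage its own service
metar = avwx.Metar("YWOL")
metar.update()  # fetches via the station's preferred service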
Adding a New Service
If the existing services are not supplying the report(s) you need, adding a new service is easy. First, you'll need to determine whether your source can be scraped or you need to download a file.
ScrapeService
For web scraping sources, you'll need to do the following things:
- Add the base URL and method (if not "GET")
- Implement the ScrapeService._make_url to return the source URL and query parameters
- Implement the ScrapeService._extract function to return just the report string (starting at the station ID) from the response
Let's look at a simplified version of the Mac service as an example:

class Mac(StationScrape):
    """Requests data from Meteorologia Aeronautica Civil for Colombian stations"""

    _url = "https://meteorologia.aerocivil.gov.co/expert_text_query/parse"
    method = "POST"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Returns a formatted URL and parameters"""
        return self._url, {"query": f"{self.report_type} {station}"}

    def _extract(self, raw: str, station: str) -> str:
        """Extracts the report message using string finding"""
        return self._simple_extract(raw, f"{station.upper()} ", "=")
Our URL and query parameters are returned using _make_url so fetch knows how to request the report. The result of this query is given to _extract, which returns the report or list of reports.
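Putting the pieces together, a new scrape service can be as small as the sketch below. Everything here is hypothetical: the URL, the query parameter names, and the "=" report terminator are stand-ins for whatever your source actually uses, and StationScrape is assumed importable from avwx.service.scrape, where the bundled services are defined:

from avwx.service.scrape import StationScrape

class MyMetarSource(StationScrape):
    """Hypothetical service returning "STATION ...=" formatted reports"""

    _url = "https://example.com/reports"  # stand-in URL

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Returns a formatted URL and query parameters"""
        return self._url, {"station": station, "type": self.report_type}

    def _extract(self, raw: str, station: str) -> str:
        """Extracts the report string from the response body"""
        return self._simple_extract(raw, f"{station.upper()} ", "=")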
Once your service is created, it can optionally be added to avwx.service.scrape.PREFERRED if the service covers all stations with a known ICAO prefix, or avwx.service.scrape.BY_COUNTRY if the service covers all stations in a single country. This is how avwx.service.get_service determines the preferred service. For example, the Mac service is preferred over Noaa for all ICAOs starting with "SK", while Aubom is better for all Australian stations.
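Registration is just a dictionary entry, assuming both registries are plain dicts as their use in get_service suggests. A sketch using the hypothetical class above; the "ZZ" prefix and "XX" country code are placeholders:

from avwx.service import scrape

# Prefer MyMetarSource for every ICAO starting with "ZZ"
scrape.PREFERRED["ZZ"] = MyMetarSource
# Or make it the default for all stations in country "XX"
scrape.BY_COUNTRY["XX"] = MyMetarSource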
FileService
For file-based sources, you'll need to do the following things:
- Add the base URL and valid report types
- Implement the FileService._urls to iterate through source URLs
- Implement the FileService._extract function to return just the report string (starting at the station ID) from the response
Let's look at a simplified version of the NoaaNbm service as an example:

class NoaaNbm(FileService):
    """Requests forecast data from NOAA NBM FTP servers"""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/blend/prod/blend.{}/{}/text/blend_{}tx.t{}z"
    _valid_types = ("nbh", "nbs", "nbe")

    @property
    def _urls(self) -> Iterator[str]:
        """Iterates through hourly updates no older than two days"""
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            timestamp = date.strftime(r"%Y%m%d")
            hour = str(date.hour).zfill(2)
            yield self._url.format(timestamp, hour, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _extract(self, station: str, source: TextIO) -> Optional[str]:
        """Returns the report pulled from the saved file"""
        start = station + " "
        end = self.report_type.upper() + " GUIDANCE"
        txt = source.read()
        txt = txt[txt.find(start) :]
        txt = txt[: txt.find(end, 30)]
        lines = []
        for line in txt.split("\n"):
            if "CLIMO" not in line:
                line = line.strip()
            if not line:
                break
            lines.append(line)
        return "\n".join(lines) or None
In this example, we iterate through _urls looking for the most recent published file. URL iterators should always have a lower bound to stop iteration so the service can return a null response.
Once the file is downloaded, the requested station and file-like object are passed to the _extract method to find and return the report from the file. This method will not be called if the file doesn't exist.
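API Reference

The avwx.service package root re-exports the service classes and the get_service helper from its base, files, and scrape submodules: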
1""".. include:: ../../docs/service.md""" 2 3from avwx.service.base import Service 4from avwx.service.files import NoaaGfs, NoaaNbm 5from avwx.service.scrape import ( 6 Amo, 7 Aubom, 8 Avt, 9 # FaaNotam, 10 Mac, 11 Nam, 12 Noaa, 13 Olbs, 14 get_service, 15) 16 17__all__ = ( 18 "get_service", 19 "Noaa", 20 "Amo", 21 "Aubom", 22 "Avt", 23 "Mac", 24 "Nam", 25 "Olbs", 26 # "FaaNotam", 27 "NoaaGfs", 28 "NoaaNbm", 29 "Service", 30)
def get_service(station: str, country_code: str) -> ScrapeService:
    """Return the preferred scrape service for a given station.

    ```python
    # Fetch Australian reports
    station = "YWOL"
    country = "AU" # can source from avwx.Station.country
    # Get the station's preferred service and initialize to fetch METARs
    service = avwx.service.get_service(station, country)("metar")
    # service is now avwx.service.Aubom init'd to fetch METARs
    # Fetch the current METAR
    report = service.fetch(station)
    ```
    """
    with suppress(KeyError):
        return PREFERRED[station[:2]]  # type: ignore
    return BY_COUNTRY.get(country_code, Noaa)  # type: ignore
class Amo(StationScrape):
    """Request data from AMO KMA for Korean stations."""

    _url = "http://amoapi.kma.go.kr/amoApi/{}"
    default_timeout = 60

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and parameters."""
        return self._url.format(self.report_type), {"icao": station}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the report message from XML response."""
        resp = parsexml(raw)
        try:
            report = resp["response"]["body"]["items"]["item"][f"{self.report_type.lower()}Msg"]
        except KeyError as key_error:
            raise self._make_err(raw) from key_error
        if not report:
            msg = "The station might not exist"
            raise self._make_err(msg)
        # Replace line breaks
        report = report.replace("\n", "")
        # Remove excess leading and trailing data
        for item in (self.report_type.upper(), "SPECI"):
            if report.startswith(f"{item} "):
                report = report[len(item) + 1 :]
        report = report.rstrip("=")
        # Make every element single-spaced and stripped
        return " ".join(report.split())
class Aubom(StationScrape):
    """Request data from the Australian Bureau of Meteorology."""

    _url = "http://www.bom.gov.au/aviation/php/process.php"
    method = "POST"

    @staticmethod
    def _make_headers() -> dict:
        """Return request headers."""
        return {
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "*/*",
            "Accept-Language": "en-us",
            "Accept-Encoding": "gzip, deflate",
            "Host": "www.bom.gov.au",
            "Origin": "http://www.bom.gov.au",
            "User-Agent": secrets.choice(_USER_AGENTS),
            "Connection": "keep-alive",
        }

    def _post_data(self, station: str) -> dict:
        """Return the POST form."""
        return {"keyword": station, "type": "search", "page": "TAF"}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the reports from HTML response."""
        index = 1 if self.report_type == "taf" else 2
        try:
            report = raw.split("<p")[index]
            report = report[report.find(">") + 1 :]
        except IndexError as index_error:
            msg = "The station might not exist"
            raise self._make_err(msg) from index_error
        if report.startswith("<"):
            return ""
        report = report[: report.find("</p>")]
        return report.replace("<br />", " ")
class Avt(StationScrape):
    """Request data from AVT/XiamenAir for China.
    NOTE: This should be replaced later with a gov+https source.
    """

    _url = "http://www.avt7.com/Home/AirportMetarInfo?airport4Code="

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and empty parameters."""
        return self._url + station, {}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the reports from HTML response."""
        try:
            data = json.loads(raw)
            key = f"{self.report_type.lower()}ContentList"
            text: str = data[key]["rows"][0]["content"]
        except (TypeError, json.decoder.JSONDecodeError, KeyError, IndexError):
            return ""
        else:
            return text
class Mac(StationScrape):
    """Request data from Meteorologia Aeronautica Civil for Colombian stations."""

    _url = "https://meteorologia.aerocivil.gov.co/expert_text_query/parse"
    method = "POST"

    @staticmethod
    def _make_headers() -> dict:
        """Return request headers."""
        return {"X-Requested-With": "XMLHttpRequest"}

    def _post_data(self, station: str) -> dict:
        """Return the POST form/data payload."""
        return {"query": f"{self.report_type} {station}"}

    def _extract(self, raw: str, station: str) -> str:
        """Extract the report message using string finding."""
        return self._simple_extract(raw, f"{station.upper()} ", "=")
class Nam(StationScrape):
    """Request data from NorthAviMet for North Atlantic and Nordic countries."""

    _url = "https://www.northavimet.com/NamConWS/rest/opmet/command/0/"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and empty parameters."""
        return self._url + station, {}

    def _extract(self, raw: str, station: str) -> str:
        """Extract the reports from HTML response."""
        starts = [f">{self.report_type.upper()} <", f">{station.upper()}<", "top'>"]
        report = self._simple_extract(raw, starts, "=")
        index = report.rfind(">")
        if index > -1:
            report = report[index + 1 :]
        return f"{station} {report.strip()}"
class Olbs(StationScrape):
    """Request data from India OLBS flight briefing."""

    # _url = "https://olbs.amsschennai.gov.in/nsweb/FlightBriefing/showopmetquery.php"
    # method = "POST"

    # Temp redirect
    _url = "https://avbrief3.el.r.appspot.com/"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and parameters."""
        return self._url, {"icao": station}

    def _post_data(self, station: str) -> dict:
        """Return the POST form."""
        # Can set icaos to "V*" to return all results
        return {"icaos": station, "type": self.report_type}

    @staticmethod
    def _make_headers() -> dict:
        """Return request headers."""
        return {
            # "Content-Type": "application/x-www-form-urlencoded",
            # "Accept": "text/html, */*; q=0.01",
            # "Accept-Language": "en-us",
            "Accept-Encoding": "gzip, deflate, br",
            # "Host": "olbs.amsschennai.gov.in",
            "User-Agent": secrets.choice(_USER_AGENTS),
            "Connection": "keep-alive",
            # "Referer": "https://olbs.amsschennai.gov.in/nsweb/FlightBriefing/",
            # "X-Requested-With": "XMLHttpRequest",
            "Accept-Language": "en-US,en;q=0.9",
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Referer": "https://avbrief3.el.r.appspot.com/",
            "Host": "avbrief3.el.r.appspot.com",
        }

    def _extract(self, raw: str, station: str) -> str:
        """Extract the reports from HTML response."""
        # start = raw.find(f"{self.report_type.upper()} {station} ")
        return self._simple_extract(raw, [f">{self.report_type.upper()}</div>", station], ["=", "<"])
class NoaaGfs(NoaaForecast):
    """Request forecast data from NOAA GFS FTP servers."""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfsmos.{}/mdl_gfs{}.t{}z"
    _valid_types = ("mav", "mex")

    _cycles: ClassVar[dict[str, tuple[int, ...]]] = {"mav": (0, 6, 12, 18), "mex": (0, 12)}

    @property
    def _urls(self) -> Iterator[str]:
        """Iterate through update cycles no older than two days."""
        warnings.warn(
            "GFS fetch has been deprecated due to NOAA retiring the format. Migrate to NBM for similar data",
            DeprecationWarning,
            stacklevel=2,
        )
        now = dt.datetime.now(tz=dt.timezone.utc)
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            for cycle in reversed(self._cycles[self.report_type]):
                date = date.replace(hour=cycle)
                if date > now:
                    continue
                timestamp = date.strftime(r"%Y%m%d")
                hour = str(date.hour).zfill(2)
                yield self._url.format(timestamp, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _index_target(self, station: str) -> tuple[str, str]:
        return f"{station} GFS", f"{self.report_type.upper()} GUIDANCE"
class NoaaNbm(NoaaForecast):
    """Request forecast data from NOAA NBM FTP servers."""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/blend/prod/blend.{}/{}/text/blend_{}tx.t{}z"
    _valid_types = ("nbh", "nbs", "nbe", "nbx")

    @property
    def _urls(self) -> Iterator[str]:
        """Iterate through hourly updates no older than two days."""
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            timestamp = date.strftime(r"%Y%m%d")
            hour = str(date.hour).zfill(2)
            yield self._url.format(timestamp, hour, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _index_target(self, station: str) -> tuple[str, str]:
        return f"{station} ", f"{self.report_type.upper()} GUIDANCE"
class Service:
    """Base Service class for fetching reports."""

    report_type: str
    _url: ClassVar[str] = ""
    _valid_types: ClassVar[tuple[str, ...]] = ()

    def __init__(self, report_type: str):
        if self._valid_types and report_type not in self._valid_types:
            msg = f"'{report_type}' is not a valid report type for {self.__class__.__name__}. Expected {self._valid_types}"
            raise ValueError(msg)
        self.report_type = report_type

    @property
    def root(self) -> str | None:
        """Return the service's root URL."""
        if self._url is None:
            return None
        url = self._url[self._url.find("//") + 2 :]
        return url[: url.find("/")]
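A quick sketch of the constructor guard above, using the report types from NoaaNbm's _valid_types; the error text follows the message format shown in __init__:

import avwx.service

# "nbs" is in NoaaNbm._valid_types, so construction succeeds
service = avwx.service.NoaaNbm("nbs")

# "metar" is not a valid NBM product, so this raises ValueError
try:
    avwx.service.NoaaNbm("metar")
except ValueError as exc:
    print(exc)  # 'metar' is not a valid report type for NoaaNbm. Expected ('nbh', 'nbs', 'nbe', 'nbx')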