# avwx.service

## Report Source Services
AVWX fetches raw weather reports from third-party services via REST API calls or file downloads. We use `Service` objects to handle the request and extraction for us.
## Basic Module Use
METARs and TAFs are the most widely supported report types, so an effort has been made to localize some of these services to a regional source. The `get_service` function was introduced to determine the best service for a given station.
```python
# Fetch Australian reports
station = "YWOL"
country = "AU"  # can source from avwx.Station.country

# Get the station's preferred service and initialize to fetch METARs
service = avwx.service.get_service(station, country)("metar")
# service is now avwx.service.Aubom init'd to fetch METARs

# Fetch the current METAR
report = service.fetch(station)
```
Other report types require specific service classes, which can be found in their respective submodules. However, you can normally let the report-type classes handle these services for you.
## Adding a New Service
If the existing services are not supplying the report(s) you need, adding a new service is easy. First, determine whether your source can be scraped or whether you need to download a file.
### ScrapeService
For web scraping sources, you'll need to do the following things:

- Add the base URL and method (if not `"GET"`)
- Implement `ScrapeService._make_url` to return the source URL and query parameters
- Implement `ScrapeService._extract` to return just the report string (starting at the station ID) from the response
Let's look at the MAC service as an example:
```python
class MAC(StationScrape):
    """Requests data from Meteorologia Aeronautica Civil for Colombian stations"""

    _url = "http://meteorologia.aerocivil.gov.co/expert_text_query/parse"
    method = "POST"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Returns a formatted URL and parameters"""
        return self._url, {"query": f"{self.report_type} {station}"}

    def _extract(self, raw: str, station: str) -> str:
        """Extracts the report message using string finding"""
        return self._simple_extract(raw, f"{station.upper()} ", "=")
```
Our URL and query parameters are returned by `_make_url` so `fetch` knows how to request the report. The response text is then passed to `_extract`, which returns the report string (or list of reports).
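The `_simple_extract` helper used above boils down to a pair of string finds. Here is a rough standalone equivalent, not the library's actual implementation (which also handles multiple start targets and missing-report errors):

```python
def simple_extract(raw: str, start: str, end: str) -> str:
    """Slice the report out of a larger response body."""
    raw = raw[raw.find(start) :]         # jump to the station ID
    return raw[: raw.find(end)].strip()  # stop at the terminator, e.g. "="

# A MAC-style response embeds the report terminated by "="
body = "<html>... SKBO 212100Z 18005KT CAVOK 19/12 Q1026 RMK= ...</html>"
print(simple_extract(body, "SKBO ", "="))  # SKBO 212100Z 18005KT CAVOK 19/12 Q1026 RMK
```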
Once your service is created, it can optionally be added to `avwx.service.scrape.PREFERRED` if the service covers all stations with a known ICAO prefix, or `avwx.service.scrape.BY_COUNTRY` if the service covers all stations in a single country. This is how `avwx.service.get_service` determines the preferred service. For example, the MAC service is preferred over NOAA for all ICAOs starting with "SK", while AUBOM is better for all Australian stations.
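The lookup order can be sketched with two plain dictionaries; the service names below are stand-in strings for the real classes:

```python
# Illustrative stand-ins for avwx.service.scrape.PREFERRED / BY_COUNTRY
PREFERRED = {"SK": "MAC"}     # ICAO prefix -> regional service
BY_COUNTRY = {"AU": "AUBOM"}  # ISO country code -> national service

def get_service(station: str, country_code: str) -> str:
    """A prefix match wins over a country match; NOAA is the global fallback."""
    try:
        return PREFERRED[station[:2]]
    except KeyError:
        return BY_COUNTRY.get(country_code, "NOAA")

print(get_service("SKBO", "CO"))  # MAC   (ICAO prefix match)
print(get_service("YWOL", "AU"))  # AUBOM (country match)
print(get_service("KJFK", "US"))  # NOAA  (fallback)
```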
### FileService
For file-based sources, you'll need to do the following things:

- Add the base URL and valid report types
- Implement `FileService._urls` to iterate through source URLs
- Implement `FileService._extract` to return just the report string (starting at the station ID) from the file
Let's look at the NOAA_NBM service as an example:
```python
class NOAA_NBM(FileService):
    """Requests forecast data from NOAA NBM FTP servers"""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/blend/prod/blend.{}/{}/text/blend_{}tx.t{}z"
    _valid_types = ("nbh", "nbs", "nbe")

    @property
    def _urls(self) -> Iterator[str]:
        """Iterates through hourly updates no older than one day"""
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            timestamp = date.strftime(r"%Y%m%d")
            hour = str(date.hour).zfill(2)
            yield self._url.format(timestamp, hour, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _extract(self, station: str, source: TextIO) -> Optional[str]:
        """Returns the report pulled from the saved file"""
        start = station + " "
        end = self.report_type.upper() + " GUIDANCE"
        txt = source.read()
        txt = txt[txt.find(start) :]
        txt = txt[: txt.find(end, 30)]
        lines = []
        for line in txt.split("\n"):
            if "CLIMO" not in line:
                line = line.strip()
            if not line:
                break
            lines.append(line)
        return "\n".join(lines) or None
```
In this example, we iterate through `_urls` looking for the most recent published file. URL iterators should always have a lower bound to stop iteration so the service can return a null response.
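A bounded iterator of that shape is just a generator counting back from the current hour. This sketch uses a made-up URL template, not the real NOAA path:

```python
import datetime as dt
from typing import Iterator

def hourly_urls(template: str, max_age_hours: int = 24) -> Iterator[str]:
    """Yield candidate source URLs newest-first, stopping at a hard cutoff."""
    date = dt.datetime.now(tz=dt.timezone.utc)
    cutoff = date - dt.timedelta(hours=max_age_hours)
    while date > cutoff:  # the lower bound guarantees termination
        yield template.format(date.strftime("%Y%m%d"), f"{date.hour:02}")
        date -= dt.timedelta(hours=1)

candidates = list(hourly_urls("https://example.com/blend.{}/t{}z"))
print(len(candidates))  # one candidate per hour in the window
```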
Once the file is downloaded, the requested station and a file-like object are passed to the `_extract` method to find and return the report from the file. This method will not be called if the file doesn't exist.
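The scan above can be tried without any download by handing a simplified version a `StringIO`. This sketch drops the `CLIMO` special case and uses synthetic data, not real NBM output:

```python
import io
from typing import Optional, TextIO

def extract(station: str, report_type: str, source: TextIO) -> Optional[str]:
    """Pull one station's guidance block out of a bulk forecast file."""
    text = source.read()
    text = text[text.find(station + " ") :]  # jump to the station header
    # stop before the next station's "<TYPE> GUIDANCE" header
    text = text[: text.find(report_type.upper() + " GUIDANCE", 30)]
    lines = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:  # a blank line ends the station's block
            break
        lines.append(line)
    return "\n".join(lines) or None

bulk = "KJFK   NBS GUIDANCE\nTMP 12 13 14\n\nKLGA   NBS GUIDANCE\nTMP 11 12 13\n"
print(extract("KJFK", "nbs", io.StringIO(bulk)))
```

If the station isn't present in the file, the scan finds nothing and the function returns `None`, which is how the service signals a null response.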
Source of the `avwx.service` package root:

```python
""".. include:: ../../docs/service.md"""

from avwx.service.base import Service
from avwx.service.files import NoaaGfs, NoaaNbm
from avwx.service.scrape import (
    Amo,
    Aubom,
    Avt,
    # FaaNotam,
    Mac,
    Nam,
    Noaa,
    Olbs,
    get_service,
)

__all__ = (
    "get_service",
    "Amo",
    "Aubom",
    "Avt",
    "Mac",
    "Nam",
    "Noaa",
    "NoaaGfs",
    "NoaaNbm",
    "Olbs",
    "Service",
    # "FaaNotam",
)
```
`get_service` source:

````python
def get_service(station: str, country_code: str) -> ScrapeService:
    """Return the preferred scrape service for a given station.

    ```python
    # Fetch Australian reports
    station = "YWOL"
    country = "AU"  # can source from avwx.Station.country
    # Get the station's preferred service and initialize to fetch METARs
    service = avwx.service.get_service(station, country)("metar")
    # service is now avwx.service.Aubom init'd to fetch METARs
    # Fetch the current METAR
    report = service.fetch(station)
    ```
    """
    with suppress(KeyError):
        return PREFERRED[station[:2]]  # type: ignore
    return BY_COUNTRY.get(country_code, Noaa)  # type: ignore
````
`Amo` source:

```python
class Amo(StationScrape):
    """Request data from AMO KMA for Korean stations."""

    _url = "http://amoapi.kma.go.kr/amoApi/{}"
    default_timeout = 60

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and parameters."""
        return self._url.format(self.report_type), {"icao": station}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the report message from XML response."""
        resp = parsexml(raw)
        try:
            report = resp["response"]["body"]["items"]["item"][f"{self.report_type.lower()}Msg"]
        except KeyError as key_error:
            raise self._make_err(raw) from key_error
        if not report:
            msg = "The station might not exist"
            raise self._make_err(msg)
        # Replace line breaks
        report = report.replace("\n", "")
        # Remove excess leading and trailing data
        for item in (self.report_type.upper(), "SPECI"):
            if report.startswith(f"{item} "):
                report = report[len(item) + 1 :]
        report = report.rstrip("=")
        # Make every element single-spaced and stripped
        return " ".join(report.split())
```
`Aubom` source:

```python
class Aubom(StationScrape):
    """Request data from the Australian Bureau of Meteorology."""

    _url = "http://www.bom.gov.au/aviation/php/process.php"
    method = "POST"

    @staticmethod
    def _make_headers() -> dict:
        """Return request headers."""
        return {
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "*/*",
            "Accept-Language": "en-us",
            "Accept-Encoding": "gzip, deflate",
            "Host": "www.bom.gov.au",
            "Origin": "http://www.bom.gov.au",
            "User-Agent": secrets.choice(_USER_AGENTS),
            "Connection": "keep-alive",
        }

    def _post_data(self, station: str) -> dict:
        """Return the POST form."""
        return {"keyword": station, "type": "search", "page": "TAF"}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the reports from HTML response."""
        index = 1 if self.report_type == "taf" else 2
        try:
            report = raw.split("<p")[index]
            report = report[report.find(">") + 1 :]
        except IndexError as index_error:
            msg = "The station might not exist"
            raise self._make_err(msg) from index_error
        if report.startswith("<"):
            return ""
        report = report[: report.find("</p>")]
        return report.replace("<br />", " ")
```
`Avt` source:

```python
class Avt(StationScrape):
    """Request data from AVT/XiamenAir for China.
    NOTE: This should be replaced later with a gov+https source.
    """

    _url = "http://www.avt7.com/Home/AirportMetarInfo?airport4Code="

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and empty parameters."""
        return self._url + station, {}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the reports from HTML response."""
        try:
            data = json.loads(raw)
            key = f"{self.report_type.lower()}ContentList"
            text: str = data[key]["rows"][0]["content"]
        except (TypeError, json.decoder.JSONDecodeError, KeyError, IndexError):
            return ""
        else:
            return text
```
`Mac` source:

```python
class Mac(StationScrape):
    """Request data from Meteorologia Aeronautica Civil for Colombian stations."""

    _url = "https://meteorologia.aerocivil.gov.co/expert_text_query/parse"
    method = "POST"

    @staticmethod
    def _make_headers() -> dict:
        """Return request headers."""
        return {"X-Requested-With": "XMLHttpRequest"}

    def _post_data(self, station: str) -> dict:
        """Return the POST form/data payload."""
        return {"query": f"{self.report_type} {station}"}

    def _extract(self, raw: str, station: str) -> str:
        """Extract the report message using string finding."""
        return self._simple_extract(raw, f"{station.upper()} ", "=")
```
`Nam` source:

```python
class Nam(StationScrape):
    """Request data from NorthAviMet for North Atlantic and Nordic countries."""

    _url = "https://www.northavimet.com/NamConWS/rest/opmet/command/0/"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and empty parameters."""
        return self._url + station, {}

    def _extract(self, raw: str, station: str) -> str:
        """Extract the reports from HTML response."""
        starts = [f">{self.report_type.upper()} <", f">{station.upper()}<", "top'>"]
        report = self._simple_extract(raw, starts, "=")
        index = report.rfind(">")
        if index > -1:
            report = report[index + 1 :]
        return f"{station} {report.strip()}"
```
`NoaaGfs` source:

```python
class NoaaGfs(NoaaForecast):
    """Request forecast data from NOAA GFS FTP servers."""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfsmos.{}/mdl_gfs{}.t{}z"
    _valid_types = ("mav", "mex")

    _cycles: ClassVar[dict[str, tuple[int, ...]]] = {"mav": (0, 6, 12, 18), "mex": (0, 12)}

    @property
    def _urls(self) -> Iterator[str]:
        """Iterate through update cycles no older than two days."""
        warnings.warn(
            "GFS fetch has been deprecated due to NOAA retiring the format. Migrate to NBM for similar data",
            DeprecationWarning,
            stacklevel=2,
        )
        now = dt.datetime.now(tz=dt.timezone.utc)
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            for cycle in reversed(self._cycles[self.report_type]):
                date = date.replace(hour=cycle)
                if date > now:
                    continue
                timestamp = date.strftime(r"%Y%m%d")
                hour = str(date.hour).zfill(2)
                yield self._url.format(timestamp, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _index_target(self, station: str) -> tuple[str, str]:
        return f"{station} GFS", f"{self.report_type.upper()} GUIDANCE"
```
`NoaaNbm` source:

```python
class NoaaNbm(NoaaForecast):
    """Request forecast data from NOAA NBM FTP servers."""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/blend/prod/blend.{}/{}/text/blend_{}tx.t{}z"
    _valid_types = ("nbh", "nbs", "nbe", "nbx")

    @property
    def _urls(self) -> Iterator[str]:
        """Iterate through hourly updates no older than two days."""
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            timestamp = date.strftime(r"%Y%m%d")
            hour = str(date.hour).zfill(2)
            yield self._url.format(timestamp, hour, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _index_target(self, station: str) -> tuple[str, str]:
        return f"{station} ", f"{self.report_type.upper()} GUIDANCE"
```
`Olbs` source:

```python
class Olbs(StationScrape):
    """Request data from India OLBS flight briefing."""

    # _url = "https://olbs.amsschennai.gov.in/nsweb/FlightBriefing/showopmetquery.php"
    # method = "POST"

    # Temp redirect
    _url = "https://avbrief3.el.r.appspot.com/api/report"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and empty parameters."""
        return self._url, {"icao": station}

    def _post_data(self, station: str) -> dict:
        """Return the POST form."""
        # Can set icaos to "V*" to return all results
        return {"icaos": station, "type": self.report_type}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the reports from HTML response."""
        try:
            data = json.loads(raw.strip())
            text: str = data[f"raw{self.report_type.lower()}"]
        except (TypeError, json.decoder.JSONDecodeError, KeyError, IndexError):
            return ""
        else:
            return text
```
`Service` base class source:

```python
class Service:
    """Base Service class for fetching reports."""

    report_type: str
    _url: ClassVar[str] = ""
    _valid_types: ClassVar[tuple[str, ...]] = ()

    def __init__(self, report_type: str):
        if self._valid_types and report_type not in self._valid_types:
            msg = f"'{report_type}' is not a valid report type for {self.__class__.__name__}. Expected {self._valid_types}"
            raise ValueError(msg)
        self.report_type = report_type

    @property
    def root(self) -> str | None:
        """Return the service's root URL."""
        if self._url is None:
            return None
        url = self._url[self._url.find("//") + 2 :]
        return url[: url.find("/")]
```
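The host parsing behind the `root` property can be checked in isolation. This is a standalone sketch of the same slicing, not the library method itself:

```python
def root(url: str) -> str:
    """Mimic Service.root: return the host portion of a service URL."""
    url = url[url.find("//") + 2 :]  # drop the scheme prefix
    return url[: url.find("/")]      # keep up to the first path slash

print(root("https://nomads.ncep.noaa.gov/pub/data/"))  # nomads.ncep.noaa.gov
```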