avwx.service

Report Source Services

AVWX fetches raw weather reports from third-party services via REST API calls or file downloads. Service objects handle the request and extraction steps for us.

Basic Module Use

METARs and TAFs are the most widely supported report types, so an effort has been made to localize some of these services to regional sources. The get_service function determines the best service for a given station.

import avwx

# Fetch Australian reports
station = "YWOL"
country = "AU"  # can source from avwx.Station.country
# Get the station's preferred service and initialize it to fetch METARs
service = avwx.service.get_service(station, country)("metar")
# service is now avwx.service.Aubom init'd to fetch METARs
# Fetch the current METAR
report = service.fetch(station)

Other report types require specific service classes which are found in their respective submodules. However, you can normally let the report type classes handle these services for you.

Adding a New Service

If the existing services are not supplying the report(s) you need, adding a new service is easy. First, determine whether your source can be scraped or whether you need to download a file.

ScrapeService

For web scraping sources, you'll need to do the following things:

  • Add the base URL and method (if not "GET")
  • Implement the ScrapeService._make_url method to return the source URL and query parameters
  • Implement the ScrapeService._extract function to return just the report string (starting at the station ID) from the response

Let's look at the Mac service as an example:

class Mac(StationScrape):
    """Requests data from Meteorologia Aeronautica Civil for Colombian stations"""

    _url = "http://meteorologia.aerocivil.gov.co/expert_text_query/parse"
    method = "POST"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Returns a formatted URL and parameters"""
        return self._url, {"query": f"{self.report_type} {station}"}

    def _extract(self, raw: str, station: str) -> str:
        """Extracts the report message using string finding"""
        return self._simple_extract(raw, f"{station.upper()} ", "=")

Our URL and query parameters are returned using _make_url so fetch knows how to request the report. The result of this query is given to _extract, which returns the report or list of reports.
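The _simple_extract helper used above essentially slices the response between a start and an end marker. The function below is a simplified stand-in to show the idea, not the library's actual implementation:

```python
def simple_extract(raw: str, start: str, end: str) -> str:
    """Slice raw between the first occurrence of start and the next end marker.
    Simplified stand-in for StationScrape._simple_extract."""
    index = raw.find(start)
    if index == -1:
        return ""
    report = raw[index:]
    stop = report.find(end)
    if stop != -1:
        report = report[:stop]
    # Collapse whitespace so the report is single-spaced
    return " ".join(report.split())

html = "<body><p>SKBO 221200Z 36004KT 9999 SCT020=</p></body>"
print(simple_extract(html, "SKBO ", "="))  # SKBO 221200Z 36004KT 9999 SCT020
```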

Once your service is created, it can optionally be added to avwx.service.scrape.PREFERRED if the service covers all stations with a known ICAO prefix, or avwx.service.scrape.BY_COUNTRY if the service covers all stations in a single country. This is how avwx.service.get_service determines the preferred service. For example, the Mac service is preferred over Noaa for all ICAOs starting with "SK", while Aubom is better for all Australian stations.
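The dispatch can be pictured with a small sketch. The classes and table entries below are illustrative stand-ins, not the library's actual mappings:

```python
# Illustrative stand-ins for the real service classes
class Noaa: ...
class Mac: ...
class Aubom: ...

# ICAO-prefix and country lookups, mirroring PREFERRED and BY_COUNTRY
PREFERRED = {"SK": Mac}      # e.g. Colombian ICAOs start with "SK"
BY_COUNTRY = {"AU": Aubom}

def get_service(station: str, country_code: str) -> type:
    """Return the preferred service class, falling back to Noaa."""
    try:
        return PREFERRED[station[:2]]
    except KeyError:
        return BY_COUNTRY.get(country_code, Noaa)

print(get_service("SKBO", "CO").__name__)  # Mac
print(get_service("YWOL", "AU").__name__)  # Aubom
print(get_service("KJFK", "US").__name__)  # Noaa
```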

FileService

For file-based sources, you'll need to do the following things:

  • Add the base URL and valid report types
  • Implement the FileService._urls property to iterate through source URLs
  • Implement the FileService._extract function to return just the report string (starting at the station ID) from the response

Let's look at the NOAA_NBM service as an example:

import datetime as dt
from typing import Iterator, Optional, TextIO

class NOAA_NBM(FileService):
    """Requests forecast data from NOAA NBM FTP servers"""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/blend/prod/blend.{}/{}/text/blend_{}tx.t{}z"
    _valid_types = ("nbh", "nbs", "nbe")

    @property
    def _urls(self) -> Iterator[str]:
        """Iterates through hourly updates no older than one day"""
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            timestamp = date.strftime(r"%Y%m%d")
            hour = str(date.hour).zfill(2)
            yield self._url.format(timestamp, hour, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _extract(self, station: str, source: TextIO) -> Optional[str]:
        """Returns the report pulled from the saved file"""
        start = station + "   "
        end = self.report_type.upper() + " GUIDANCE"
        txt = source.read()
        txt = txt[txt.find(start) :]
        txt = txt[: txt.find(end, 30)]
        lines = []
        for line in txt.split("\n"):
            if "CLIMO" not in line:
                line = line.strip()
            if not line:
                break
            lines.append(line)
        return "\n".join(lines) or None

In this example, we iterate through _urls looking for the most recent published file. URL iterators should always have a time cutoff to stop iteration so the service can return a null response when no file is found.
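A time-bounded URL iterator of this shape can be sketched in isolation. The template string and cutoff below are illustrative, not an actual NOAA endpoint:

```python
import datetime as dt
from typing import Iterator

def hourly_urls(template: str, now: dt.datetime, max_age: dt.timedelta) -> Iterator[str]:
    """Yield candidate URLs newest-first, stopping at the cutoff so a
    caller can fall through to a null response."""
    date = now
    cutoff = now - max_age
    while date > cutoff:
        yield template.format(date.strftime("%Y%m%d"), f"{date.hour:02d}")
        date -= dt.timedelta(hours=1)

now = dt.datetime(2024, 1, 2, 3, tzinfo=dt.timezone.utc)
urls = list(hourly_urls("blend.{}/t{}z", now, dt.timedelta(hours=3)))
print(urls)  # ['blend.20240102/t03z', 'blend.20240102/t02z', 'blend.20240102/t01z']
```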

Once the file is downloaded, the requested station and file-like object are passed to the _extract method to find and return the report from the file. This method will not be called if the file doesn't exist.
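To see the extraction step in isolation, the sketch below feeds a file-like object to a simplified version of the slicing logic. The sample text and helper are illustrative, not the library's code:

```python
import io

def extract_block(station: str, source, end_marker: str):
    """Slice one station's section out of the full file text."""
    txt = source.read()
    start = txt.find(station + "   ")
    if start == -1:
        return None  # station not in this file
    txt = txt[start:]
    # Search past index 30 so this block's own header is skipped
    stop = txt.find(end_marker, 30)
    if stop != -1:
        # Drop the partial line belonging to the next station's header
        txt = txt[:stop].rsplit("\n", 1)[0]
    return txt.strip() or None

sample = (
    "KOKC    NBS GUIDANCE   10/15/2024  1200 UTC\n"
    " TMP  54 52 49\n"
    "KJFK    NBS GUIDANCE   10/15/2024  1200 UTC\n"
)
report = extract_block("KOKC", io.StringIO(sample), "NBS GUIDANCE")
```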

""".. include:: ../../docs/service.md"""

from avwx.service.base import Service
from avwx.service.files import NoaaGfs, NoaaNbm
from avwx.service.scrape import (
    Amo,
    Aubom,
    Avt,
    # FaaNotam,
    Mac,
    Nam,
    Noaa,
    Olbs,
    get_service,
)

__all__ = (
    "get_service",
    "Amo",
    "Aubom",
    "Avt",
    "Mac",
    "Nam",
    "Noaa",
    "NoaaGfs",
    "NoaaNbm",
    "Olbs",
    "Service",
    # "FaaNotam",
)
def get_service(station: str, country_code: str) -> ScrapeService:
    """Return the preferred scrape service for a given station.

    ```python
    # Fetch Australian reports
    station = "YWOL"
    country = "AU"  # can source from avwx.Station.country
    # Get the station's preferred service and initialize to fetch METARs
    service = avwx.service.get_service(station, country)("metar")
    # service is now avwx.service.Aubom init'd to fetch METARs
    # Fetch the current METAR
    report = service.fetch(station)
    ```
    """
    with suppress(KeyError):
        return PREFERRED[station[:2]]  # type: ignore
    return BY_COUNTRY.get(country_code, Noaa)  # type: ignore

class Amo(StationScrape):
    """Request data from AMO KMA for Korean stations."""

    _url = "http://amoapi.kma.go.kr/amoApi/{}"
    default_timeout = 60

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and parameters."""
        return self._url.format(self.report_type), {"icao": station}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the report message from XML response."""
        resp = parsexml(raw)
        try:
            report = resp["response"]["body"]["items"]["item"][f"{self.report_type.lower()}Msg"]
        except KeyError as key_error:
            raise self._make_err(raw) from key_error
        if not report:
            msg = "The station might not exist"
            raise self._make_err(msg)
        # Replace line breaks
        report = report.replace("\n", "")
        # Remove excess leading and trailing data
        for item in (self.report_type.upper(), "SPECI"):
            if report.startswith(f"{item} "):
                report = report[len(item) + 1 :]
        report = report.rstrip("=")
        # Make every element single-spaced and stripped
        return " ".join(report.split())

class Aubom(StationScrape):
    """Request data from the Australian Bureau of Meteorology."""

    _url = "http://www.bom.gov.au/aviation/php/process.php"
    method = "POST"

    @staticmethod
    def _make_headers() -> dict:
        """Return request headers."""
        return {
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "*/*",
            "Accept-Language": "en-us",
            "Accept-Encoding": "gzip, deflate",
            "Host": "www.bom.gov.au",
            "Origin": "http://www.bom.gov.au",
            "User-Agent": secrets.choice(_USER_AGENTS),
            "Connection": "keep-alive",
        }

    def _post_data(self, station: str) -> dict:
        """Return the POST form."""
        return {"keyword": station, "type": "search", "page": "TAF"}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the reports from HTML response."""
        index = 1 if self.report_type == "taf" else 2
        try:
            report = raw.split("<p")[index]
            report = report[report.find(">") + 1 :]
        except IndexError as index_error:
            msg = "The station might not exist"
            raise self._make_err(msg) from index_error
        if report.startswith("<"):
            return ""
        report = report[: report.find("</p>")]
        return report.replace("<br />", " ")

class Avt(StationScrape):
    """Request data from AVT/XiamenAir for China.
    NOTE: This should be replaced later with a gov+https source.
    """

    _url = "http://www.avt7.com/Home/AirportMetarInfo?airport4Code="

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and empty parameters."""
        return self._url + station, {}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the reports from HTML response."""
        try:
            data = json.loads(raw)
            key = f"{self.report_type.lower()}ContentList"
            text: str = data[key]["rows"][0]["content"]
        except (TypeError, json.decoder.JSONDecodeError, KeyError, IndexError):
            return ""
        else:
            return text


class Mac(StationScrape):
    """Request data from Meteorologia Aeronautica Civil for Colombian stations."""

    _url = "https://meteorologia.aerocivil.gov.co/expert_text_query/parse"
    method = "POST"

    @staticmethod
    def _make_headers() -> dict:
        """Return request headers."""
        return {"X-Requested-With": "XMLHttpRequest"}

    def _post_data(self, station: str) -> dict:
        """Return the POST form/data payload."""
        return {"query": f"{self.report_type} {station}"}

    def _extract(self, raw: str, station: str) -> str:
        """Extract the report message using string finding."""
        return self._simple_extract(raw, f"{station.upper()} ", "=")

class Nam(StationScrape):
    """Request data from NorthAviMet for North Atlantic and Nordic countries."""

    _url = "https://www.northavimet.com/NamConWS/rest/opmet/command/0/"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and empty parameters."""
        return self._url + station, {}

    def _extract(self, raw: str, station: str) -> str:
        """Extract the reports from HTML response."""
        starts = [f">{self.report_type.upper()} <", f">{station.upper()}<", "top'>"]
        report = self._simple_extract(raw, starts, "=")
        index = report.rfind(">")
        if index > -1:
            report = report[index + 1 :]
        return f"{station} {report.strip()}"


Noaa = <class 'avwx.service.scrape.NoaaApi'>
class NoaaGfs(NoaaForecast):
    """Request forecast data from NOAA GFS FTP servers."""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfsmos.{}/mdl_gfs{}.t{}z"
    _valid_types = ("mav", "mex")

    _cycles: ClassVar[dict[str, tuple[int, ...]]] = {"mav": (0, 6, 12, 18), "mex": (0, 12)}

    @property
    def _urls(self) -> Iterator[str]:
        """Iterate through update cycles no older than one day."""
        warnings.warn(
            "GFS fetch has been deprecated due to NOAA retiring the format. Migrate to NBM for similar data",
            DeprecationWarning,
            stacklevel=2,
        )
        now = dt.datetime.now(tz=dt.timezone.utc)
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            for cycle in reversed(self._cycles[self.report_type]):
                date = date.replace(hour=cycle)
                if date > now:
                    continue
                timestamp = date.strftime(r"%Y%m%d")
                hour = str(date.hour).zfill(2)
                yield self._url.format(timestamp, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _index_target(self, station: str) -> tuple[str, str]:
        return f"{station}   GFS", f"{self.report_type.upper()} GUIDANCE"


class NoaaNbm(NoaaForecast):
    """Request forecast data from NOAA NBM FTP servers."""

    _url = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/blend/prod/blend.{}/{}/text/blend_{}tx.t{}z"
    _valid_types = ("nbh", "nbs", "nbe", "nbx")

    @property
    def _urls(self) -> Iterator[str]:
        """Iterate through hourly updates no older than one day."""
        date = dt.datetime.now(tz=dt.timezone.utc)
        cutoff = date - dt.timedelta(days=1)
        while date > cutoff:
            timestamp = date.strftime(r"%Y%m%d")
            hour = str(date.hour).zfill(2)
            yield self._url.format(timestamp, hour, self.report_type, hour)
            date -= dt.timedelta(hours=1)

    def _index_target(self, station: str) -> tuple[str, str]:
        return f"{station}   ", f"{self.report_type.upper()} GUIDANCE"


class Olbs(StationScrape):
    """Request data from India OLBS flight briefing."""

    # _url = "https://olbs.amsschennai.gov.in/nsweb/FlightBriefing/showopmetquery.php"
    # method = "POST"

    # Temp redirect
    _url = "https://avbrief3.el.r.appspot.com/api/report"

    def _make_url(self, station: str) -> tuple[str, dict]:
        """Return a formatted URL and empty parameters."""
        return self._url, {"icao": station}

    def _post_data(self, station: str) -> dict:
        """Return the POST form."""
        # Can set icaos to "V*" to return all results
        return {"icaos": station, "type": self.report_type}

    def _extract(self, raw: str, station: str) -> str:  # noqa: ARG002
        """Extract the reports from HTML response."""
        try:
            data = json.loads(raw.strip())
            text: str = data[f"raw{self.report_type.lower()}"]
        except (TypeError, json.decoder.JSONDecodeError, KeyError, IndexError):
            return ""
        else:
            return text


class Service:
    """Base Service class for fetching reports."""

    report_type: str
    _url: ClassVar[str] = ""
    _valid_types: ClassVar[tuple[str, ...]] = ()

    def __init__(self, report_type: str):
        if self._valid_types and report_type not in self._valid_types:
            msg = f"'{report_type}' is not a valid report type for {self.__class__.__name__}. Expected {self._valid_types}"
            raise ValueError(msg)
        self.report_type = report_type

    @property
    def root(self) -> str | None:
        """Return the service's root URL."""
        if self._url is None:
            return None
        url = self._url[self._url.find("//") + 2 :]
        return url[: url.find("/")]

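The root property above amounts to two string slices. As a stand-alone sketch, with an illustrative URL:

```python
def root(url: str) -> str:
    """Return the host portion of a service URL (mirrors Service.root)."""
    url = url[url.find("//") + 2 :]  # drop the scheme and "//"
    return url[: url.find("/")]      # keep everything before the first path slash

print(root("https://nomads.ncep.noaa.gov/pub/data/nccf/"))  # nomads.ncep.noaa.gov
```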