Data Parsers API#

Event Data Parsers#

load_instat_event_data(event_data_loc: str, metadata_loc: str) tuple[DataFrame, Metadata][source]#

This function retrieves the metadata and event data of a specific game. The x and y coordinates provided have been scaled to the dimensions of the pitch, with (0, 0) being the center. Additionally, the coordinates have been standardized so that the home team is represented as playing from left to right for the entire game, and the away team is represented as playing from right to left.

Parameters:
  • event_data_loc (str) – location of the event_data.json file

  • metadata_loc (str) – location of the metadata.json file

Returns:

the event data of the game and the metadata

Return type:

Tuple[pd.DataFrame, Metadata]
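
The coordinate convention described for this loader can be illustrated with a minimal sketch. It is not the parser's actual implementation. The function name `normalize_coordinates`, the assumption that raw coordinates lie on a 0 to 100 grid, the `flip` argument, and the example pitch size are all hypothetical and used only for illustration.

```python
def normalize_coordinates(x_raw, y_raw, pitch_dimensions, flip=False):
    """Map raw coordinates in [0, 100] to meters with (0, 0) at the pitch center.

    Assumption: the raw provider grid spans 0..100 on both axes.
    `flip` mirrors the point through the center, which is one way to keep
    a team attacking in the same direction for the whole game.
    """
    length, width = pitch_dimensions
    x = (x_raw / 100.0 - 0.5) * length
    y = (y_raw / 100.0 - 0.5) * width
    if flip:
        x, y = -x, -y
    return x, y


# Hypothetical pitch of 106 x 68 meters.
pitch = (106.0, 68.0)
center = normalize_coordinates(50.0, 50.0, pitch)
corner = normalize_coordinates(100.0, 100.0, pitch)
mirrored = normalize_coordinates(100.0, 100.0, pitch, flip=True)
```

With this convention the center of the raw grid maps to (0, 0), and a mirrored point has both signs reversed, so an event recorded in the second half can be expressed in the same direction of play as the first half.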

load_metrica_event_data(event_data_loc: str, metadata_loc: str) tuple[DataFrame, Metadata, dict][source]#

Function to load the metrica event data.

Parameters:
  • event_data_loc (str) – location of the event data .json file

  • metadata_loc (str) – location of the metadata .xml file

Raises:

TypeError – if event_data_loc or metadata_loc is not a valid input type (str)

Returns:

The event data, the metadata, and the databallpy events

Return type:

Tuple[pd.DataFrame, Metadata, dict]

load_metrica_open_event_data() tuple[DataFrame, Metadata, dict][source]#

Function to load the open event data of metrica

Returns:

the event data, the metadata of the game, and the databallpy events

Return type:

Tuple[pd.DataFrame, Metadata, dict]

load_scisports_event_data(events_json: str, pitch_dimensions: tuple = (106.0, 68.0)) tuple[DataFrame, Metadata, dict][source]#

This function retrieves the metadata and event data of a specific game. The x and y coordinates provided have been scaled to the dimensions of the pitch, with (0, 0) being the center. Additionally, the coordinates have been standardized so that the home team is represented as playing from left to right for the entire game, and the away team is represented as playing from right to left.

Parameters:
  • events_json (str) – location of the event.json file.

  • pitch_dimensions (tuple, optional) – the length and width of the pitch in meters

Returns:

the event data of the game, the metadata, and the databallpy_events.

Return type:

Tuple[pd.DataFrame, Metadata, dict]

load_sportec_event_data(event_data_loc: str, metadata_loc: str) tuple[DataFrame, Metadata, dict[str, dict]][source]#

Base function to load the sportec/DFL event data.

Parameters:
  • event_data_loc (str) – the location of the event data xml

  • metadata_loc (str) – the location of the metadata xml

Raises:

FileNotFoundError – If the event or metadata location is not found

Returns:

The event data, the event metadata, and the databallpy events dictionary.

Return type:

tuple[pd.DataFrame, Metadata, dict[str, dict]]
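
The documented `FileNotFoundError` behavior can be sketched as an up-front check on both paths. This is an illustrative assumption about how such a check might look, not the library's code. The helper name `require_files` and the file names are hypothetical.

```python
from pathlib import Path


def require_files(event_data_loc, metadata_loc):
    """Raise FileNotFoundError if either input file is missing."""
    for label, loc in (("event data", event_data_loc), ("metadata", metadata_loc)):
        if not Path(loc).is_file():
            raise FileNotFoundError(f"Could not find {label} file: {loc}")


# Hypothetical file names that are not expected to exist.
try:
    require_files("does_not_exist_events.xml", "does_not_exist_metadata.xml")
    missing = False
except FileNotFoundError as err:
    missing = True
    message = str(err)
```

Checking both locations before parsing means a wrong path fails immediately with a message that names the offending file.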

load_sportec_open_event_data(game_id: str, cache_path: Path) tuple[DataFrame, Metadata, dict[str, dict]][source]#

Function to (down)load an open game from Sportec/Tracab

Parameters:
  • game_id (str) – The id of the open game

  • cache_path (Path) – path to cache files.

Returns:

The event data, the event metadata, and the databallpy events dictionary.

Return type:

tuple[pd.DataFrame, Metadata, dict[str, dict]]

Reference:

Bassek, M., Weber, H., Rein, R., & Memmert, D. (2024). An integrated dataset of synchronized spatiotemporal and event data in elite soccer.
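
The role of `cache_path` can be illustrated with a small download-once pattern. This is a sketch under stated assumptions and does not reproduce the loader's internals: `load_with_cache`, the `download` callable, the cache file naming, and the id `"game_1"` are hypothetical, and no network request is made.

```python
import tempfile
from pathlib import Path


def load_with_cache(game_id, cache_path, download):
    """Return the raw file content for game_id, downloading only on a cache miss.

    `download` is a hypothetical callable standing in for the real request.
    """
    cache_path = Path(cache_path)
    cache_path.mkdir(parents=True, exist_ok=True)
    cached_file = cache_path / f"{game_id}_events.xml"
    if not cached_file.exists():
        cached_file.write_text(download(game_id), encoding="utf-8")
    return cached_file.read_text(encoding="utf-8")


calls = []


def fake_download(game_id):
    """Stand-in for a network download; records each call."""
    calls.append(game_id)
    return f"<events game='{game_id}'/>"


with tempfile.TemporaryDirectory() as tmp:
    first = load_with_cache("game_1", tmp, fake_download)
    second = load_with_cache("game_1", tmp, fake_download)
```

The second call reads from the cache directory, so the stand-in downloader runs only once.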

load_statsbomb_event_data(events_loc: str, match_loc: str, lineup_loc: str, pitch_dimensions: tuple = (105.0, 68.0)) tuple[DataFrame, Metadata, dict][source]#

This function retrieves the metadata and event data of a specific game. The x and y coordinates provided have been scaled to the dimensions of the pitch, with (0, 0) being the center. Additionally, the coordinates have been standardized so that the home team is represented as playing from left to right for the entire game, and the away team is represented as playing from right to left.

Parameters:
  • events_loc (str) – location of the event.json file.

  • match_loc (str) – location of the game.json file.

  • lineup_loc (str) – location of the lineup.json file.

  • pitch_dimensions (tuple, optional) – the length and width of the pitch. Input should be in yards (as this is statsbomb standard (120, 80)) and is recalculated to meters in this function. Defaults to (105.0, 68.0)

Returns:

the event data of the game, the metadata, and the databallpy_events.

Return type:

Tuple[pd.DataFrame, Metadata, dict]
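
The yards-to-meters recalculation mentioned for `pitch_dimensions` relies on the fixed conversion 1 yard = 0.9144 m. The sketch below shows only that unit conversion. The helper `yards_to_meters` is hypothetical, and how the parser applies the conversion internally is not stated here.

```python
YARD_IN_METERS = 0.9144  # exact by definition of the international yard


def yards_to_meters(pitch_dimensions_yards):
    """Convert a (length, width) pair from yards to meters."""
    length_yd, width_yd = pitch_dimensions_yards
    return (length_yd * YARD_IN_METERS, width_yd * YARD_IN_METERS)


# The StatsBomb standard grid of (120, 80) expressed in meters.
length_m, width_m = yards_to_meters((120, 80))
```

For the (120, 80) grid this gives roughly 109.7 m by 73.2 m.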

Tracking Data Parsers#

load_inmotio_tracking_data(tracking_data_loc: str, metadata_loc: str, verbose: bool = True) tuple[DataFrame, Metadata][source]#

Function to load inmotio tracking data.

Parameters:
  • tracking_data_loc (str) – location of the tracking data .txt file

  • metadata_loc (str) – location of the metadata .xml file

  • verbose (bool, optional) – whether to print information about the progress in the terminal. Defaults to True.

Raises:

TypeError – if tracking_data_loc is not a string

Returns:

the tracking data and metadata of the game

Return type:

Tuple[pd.DataFrame, Metadata]

load_metrica_open_tracking_data(verbose: bool = True) tuple[DataFrame, Metadata][source]#

Function to load open dataset of metrica

Parameters:
  • verbose (bool) – Whether or not to print info in the terminal. Defaults to True.

  • cache_path (Path) – path to save the tracking file to as cache.

Returns:

the tracking data and metadata of the game

Return type:

Tuple[pd.DataFrame, Metadata]

load_metrica_tracking_data(tracking_data_loc: str, metadata_loc: str, verbose: bool = True) tuple[DataFrame, Metadata][source]#

Function to load metrica tracking data.

Parameters:
  • tracking_data_loc (str) – location of the tracking data .txt file

  • metadata_loc (str) – location of the metadata .xml file

  • verbose (bool, optional) – whether to print information about the progress in the terminal. Defaults to True.

Raises:

TypeError – if tracking_data_loc is not a string or io.StringIO

Returns:

the tracking data and metadata of the game

Return type:

Tuple[pd.DataFrame, Metadata]
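
The documented `TypeError` condition (input that is neither a string nor an `io.StringIO`) can be sketched as a simple type check. The helper name `check_tracking_input` and the error message are hypothetical. The real loader may validate its input differently.

```python
import io


def check_tracking_input(tracking_data_loc):
    """Raise TypeError unless the input is a str path or an io.StringIO buffer."""
    if not isinstance(tracking_data_loc, (str, io.StringIO)):
        raise TypeError(
            "tracking_data_loc should be a str or io.StringIO, "
            f"not {type(tracking_data_loc).__name__}"
        )
    return tracking_data_loc


check_tracking_input("tracking_data.txt")     # a file path is accepted
check_tracking_input(io.StringIO("1,2,3\n"))  # an in-memory buffer is accepted

try:
    check_tracking_input(42)  # any other type is rejected
    rejected = False
except TypeError:
    rejected = True
```

Accepting an in-memory buffer alongside a path makes it possible to pass already-downloaded text without writing it to disk first.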

load_sportec_open_tracking_data(game_id: str, verbose: bool, cache_path: Path) tuple[DataFrame, Metadata][source]#

Load the tracking data from the sportec open data platform

Parameters:
  • game_id (str) – The id of the game

  • verbose (bool) – Whether to print info about the loading of the data.

  • cache_path (Path) – path to cache files.

Returns:

the tracking data and metadata class

Return type:

tuple[pd.DataFrame, Metadata]

Reference:

Bassek, M., Weber, H., Rein, R., & Memmert, D. (2024). An integrated dataset of synchronized spatiotemporal and event data in elite soccer.

load_tracab_tracking_data(tracab_loc: str, metadata_loc: str, verbose: bool = True) tuple[DataFrame, Metadata][source]#

Function to load tracking data and metadata from the tracab format

Parameters:
  • tracab_loc (str) – location of the tracking_data.dat file

  • metadata_loc (str) – location of the meta_data.xml file

  • verbose (bool) – whether to print the loading progress in the terminal. Defaults to True.

Returns:

the tracking data and metadata class

Return type:

Tuple[pd.DataFrame, Metadata]

Parse from Kloppy#

convert_kloppy_tracking_dataset(tracking_dataset: TrackingDataset, periods: DataFrame) TrackingData[source]#

Function to convert a Kloppy tracking dataset into a TrackingData object

Parameters:
  • tracking_dataset (kloppy.domain.TrackingDataset, optional) – a Kloppy tracking dataset.

  • periods (pd.DataFrame) – DataFrame containing information about the periods

Returns:

All tracking data in a TrackingData object

Return type:

TrackingData

convert_kloppy_event_dataset(event_dataset: EventDataset, periods: DataFrame) EventData[source]#

Function to convert a Kloppy event dataset into an EventData object

Parameters:
  • event_dataset (kloppy.domain.EventDataset, optional) – A Kloppy event dataset.

  • periods (pd.DataFrame) – DataFrame containing information about the periods

Returns:

All event data in an EventData object

Return type:

EventData