Data Parsers API#
Event Data Parsers#
- load_instat_event_data(event_data_loc: str, metadata_loc: str) tuple[DataFrame, Metadata][source]#
This function retrieves the metadata and event data of a specific game. The x and y coordinates provided have been scaled to the dimensions of the pitch, with (0, 0) being the center. Additionally, the coordinates have been standardized so that the home team is represented as playing from left to right for the entire game, and the away team is represented as playing from right to left.
- Parameters:
event_data_loc (str) – location of the event_data.json file
metadata_loc (str) – location of the metadata.json file
- Returns:
the event data of the game and the metadata
- Return type:
Tuple[pd.DataFrame, Metadata]
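The coordinate convention described above (origin at the pitch center, home team always playing left to right) can be illustrated with a minimal sketch. This is not the parser's actual implementation: the helper names `to_centered_meters` and `standardize_direction`, the assumption that raw coordinates arrive as percentages (0 to 100) measured from a pitch corner, and the default pitch size are all hypothetical and chosen only for illustration.

```python
def to_centered_meters(x_pct, y_pct, pitch_dimensions=(106.0, 68.0)):
    """Scale percentage coordinates to meters with (0, 0) at the pitch center.

    Assumption: raw values run from 0 to 100 along each axis.
    """
    length, width = pitch_dimensions
    x = (x_pct / 100.0 - 0.5) * length
    y = (y_pct / 100.0 - 0.5) * width
    return x, y


def standardize_direction(x, y, home_plays_left_to_right):
    """Mirror a point through the center when the home team attacks leftwards.

    After this step the home team is always represented as playing
    from left to right, whatever the real direction in that period was.
    """
    if home_plays_left_to_right:
        return x, y
    return -x, -y


center = to_centered_meters(50, 50)
corner = to_centered_meters(100, 100)
mirrored_corner = standardize_direction(*corner, home_plays_left_to_right=False)
```

Mirroring through the center (negating both x and y) keeps the pitch geometry intact, which is why a centered origin is convenient for this kind of standardization.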
- load_metrica_event_data(event_data_loc: str, metadata_loc: str) tuple[DataFrame, Metadata, dict][source]#
Function to load the metrica event data.
- Parameters:
event_data_loc (str) – location of the event data .json file
metadata_loc (str) – location of the metadata .xml file
- Raises:
TypeError – if event_data_loc or metadata_loc is not a valid input type (str)
- Returns:
the event data, the metadata, and the databallpy events
- Return type:
Tuple[pd.DataFrame, Metadata, dict]
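The kind of input check that leads to the TypeError above might look like the following sketch. The helper `check_locations_are_str` and its error message are hypothetical and are not taken from the library; the example only shows the general pattern of validating location arguments before any file is opened.

```python
def check_locations_are_str(**locations):
    """Raise TypeError for the first location argument that is not a str."""
    for name, value in locations.items():
        if not isinstance(value, str):
            raise TypeError(
                f"{name} should be a string, not a {type(value).__name__}"
            )


# Valid input passes silently.
check_locations_are_str(event_data_loc="events.json", metadata_loc="metadata.xml")

# Invalid input raises before any parsing is attempted.
try:
    check_locations_are_str(event_data_loc=123, metadata_loc="metadata.xml")
except TypeError as err:
    error_message = str(err)
```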
- load_metrica_open_event_data() tuple[DataFrame, Metadata, dict][source]#
Function to load the open event data of metrica
- Returns:
event data and metadata of the game and databallpy events
- Return type:
Tuple[pd.DataFrame, Metadata, dict]
- load_scisports_event_data(events_json: str, pitch_dimensions: tuple = (106.0, 68.0)) tuple[DataFrame, Metadata, dict][source]#
This function retrieves the metadata and event data of a specific game. The x and y coordinates provided have been scaled to the dimensions of the pitch, with (0, 0) being the center. Additionally, the coordinates have been standardized so that the home team is represented as playing from left to right for the entire game, and the away team is represented as playing from right to left.
- Parameters:
events_json (str) – location of the event.json file.
pitch_dimensions (tuple, optional) – the length and width of the pitch in meters
- Returns:
the event data of the game, the metadata, and the databallpy_events.
- Return type:
Tuple[pd.DataFrame, Metadata, dict]
- load_sportec_event_data(event_data_loc: str, metadata_loc: str) tuple[DataFrame, Metadata, dict[str, dict]][source]#
Base function to load the sportec/DFL event data.
- Parameters:
event_data_loc (str) – the location of the event data xml
metadata_loc (str) – the location of the metadata xml
- Raises:
FileNotFoundError – If the event or metadata location is not found
- Returns:
The event data, the event metadata, and the databallpy events dictionary.
- Return type:
tuple[pd.DataFrame, Metadata, dict[str, dict]]
- load_sportec_open_event_data(game_id: str, cache_path: Path) tuple[DataFrame, Metadata, dict[str, dict]][source]#
Function to (down)load an open game from Sportec/Tracab
- Parameters:
game_id (str) – The id of the open game
cache_path (Path) – path to cache files.
- Returns:
The event data, the event metadata, and the databallpy events dictionary.
- Return type:
tuple[pd.DataFrame, Metadata, dict[str, dict]]
- Reference:
Bassek, M., Weber, H., Rein, R., & Memmert, D. (2024). An integrated dataset of synchronized spatiotemporal and event data in elite soccer.
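The role of a cache_path argument can be sketched with a generic download-once pattern. This is only an illustration of the idea and not how the library stores its files: the function `load_with_cache`, the `fetch` callable, the `.xml` file naming, and the game id `demo_game` are all hypothetical.

```python
import tempfile
from pathlib import Path


def load_with_cache(game_id, cache_path, fetch):
    """Return the raw text for game_id, calling fetch only on a cache miss."""
    cache_path.mkdir(parents=True, exist_ok=True)
    target = cache_path / f"{game_id}.xml"
    if not target.exists():
        # Cache miss: retrieve the data once and store it on disk.
        target.write_text(fetch(game_id), encoding="utf-8")
    return target.read_text(encoding="utf-8")


calls = []


def fake_fetch(game_id):
    """Stand-in for a real download; records how often it is called."""
    calls.append(game_id)
    return "<events/>"


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "cache"
    first = load_with_cache("demo_game", cache_dir, fake_fetch)
    second = load_with_cache("demo_game", cache_dir, fake_fetch)
```

The second call reads from disk, so the stand-in download runs only once.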
- load_statsbomb_event_data(events_loc: str, match_loc: str, lineup_loc: str, pitch_dimensions: tuple = (105.0, 68.0)) tuple[DataFrame, Metadata, dict][source]#
This function retrieves the metadata and event data of a specific game. The x and y coordinates provided have been scaled to the dimensions of the pitch, with (0, 0) being the center. Additionally, the coordinates have been standardized so that the home team is represented as playing from left to right for the entire game, and the away team is represented as playing from right to left.
- Parameters:
events_loc (str) – location of the event.json file.
match_loc (str) – location of the game.json file.
lineup_loc (str) – location of the lineup.json file.
pitch_dimensions (tuple, optional) – the length and width of the pitch. Input should be in yards (as this is statsbomb standard (120, 80)) and is recalculated to meters in this function. Defaults to (105.0, 68.0)
- Returns:
the event data of the game, the metadata, and the databallpy_events.
- Return type:
Tuple[pd.DataFrame, Metadata, dict]
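The unit handling mentioned for pitch_dimensions can be illustrated with a short sketch. It rests on assumptions that this page does not confirm: that raw StatsBomb coordinates lie on a 120 by 80 yard grid with the origin in a corner and the y-axis pointing down, and that the target pitch is given in meters. The helpers `yards_to_meters` and `scale_statsbomb_xy` are hypothetical names and not part of the library.

```python
YARD_IN_METERS = 0.9144  # exact by definition of the international yard


def yards_to_meters(dimensions_in_yards):
    """Convert a (length, width) pair from yards to meters."""
    length, width = dimensions_in_yards
    return length * YARD_IN_METERS, width * YARD_IN_METERS


def scale_statsbomb_xy(x, y, pitch_dimensions=(105.0, 68.0), grid=(120.0, 80.0)):
    """Map a point on the 120 x 80 grid to a centered pitch in meters.

    Assumption: the grid origin is a corner and y grows downwards,
    so the y-axis is flipped while centering.
    """
    length, width = pitch_dimensions
    x_m = (x / grid[0] - 0.5) * length
    y_m = (0.5 - y / grid[1]) * width
    return x_m, y_m


grid_in_meters = yards_to_meters((120.0, 80.0))
center_spot = scale_statsbomb_xy(60.0, 40.0)
far_corner = scale_statsbomb_xy(120.0, 0.0)
```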
Tracking Data Parsers#
- load_inmotio_tracking_data(tracking_data_loc: str, metadata_loc: str, verbose: bool = True) tuple[DataFrame, Metadata][source]#
Function to load inmotio tracking data.
- Parameters:
tracking_data_loc (str) – location of the tracking data .txt file
metadata_loc (str) – location of the metadata .xml file
verbose (bool, optional) – whether to print information about the progress in the terminal. Defaults to True.
- Raises:
TypeError – if tracking_data_loc is not a string
- Returns:
tracking data and metadata of the game
- Return type:
Tuple[pd.DataFrame, Metadata]
- load_metrica_open_tracking_data(verbose: bool = True) tuple[DataFrame, Metadata][source]#
Function to load open dataset of metrica
- Parameters:
verbose (bool) – Whether or not to print info in the terminal. Defaults to True.
cache_path (Path) – path to save the tracking file to as cache.
- Returns:
tracking data and metadata of the game
- Return type:
Tuple[pd.DataFrame, Metadata]
- load_metrica_tracking_data(tracking_data_loc: str, metadata_loc: str, verbose: bool = True) tuple[DataFrame, Metadata][source]#
Function to load metrica tracking data.
- Parameters:
tracking_data_loc (str) – location of the tracking data .txt file
metadata_loc (str) – location of the metadata .xml file
verbose (bool, optional) – whether to print information about the progress in the terminal. Defaults to True.
- Raises:
TypeError – if tracking_data_loc is not a string or io.StringIO
- Returns:
tracking data and metadata of the game
- Return type:
Tuple[pd.DataFrame, Metadata]
- load_sportec_open_tracking_data(game_id: str, verbose: bool, cache_path: Path) tuple[DataFrame, Metadata][source]#
Load the tracking data from the sportec open data platform
- Parameters:
game_id (str) – The id of the game
verbose (bool) – Whether to print info about the loading of the data.
cache_path (Path) – path to cache files.
- Returns:
the tracking data and metadata class
- Return type:
tuple[pd.DataFrame, Metadata]
- Reference:
Bassek, M., Weber, H., Rein, R., & Memmert,D. (2024). An integrated dataset of synchronized spatiotemporal and event data in elite soccer.
- load_tracab_tracking_data(tracab_loc: str, metadata_loc: str, verbose: bool = True) tuple[DataFrame, Metadata][source]#
Function to load tracking data and metadata from the tracab format
- Parameters:
tracab_loc (str) – location of the tracking_data.dat file
metadata_loc (str) – location of the meta_data.xml file
verbose (bool) – whether to print the loading progress in the terminal. Defaults to True.
- Returns:
the tracking data and metadata class
- Return type:
Tuple[pd.DataFrame, Metadata]
Parse from Kloppy#
- convert_kloppy_tracking_dataset(tracking_dataset: TrackingDataset, periods: DataFrame) TrackingData[source]#
Function to get all information of a game given kloppy dataset(s)
- Parameters:
tracking_dataset (kloppy.domain.TrackingDataset, optional) – a Kloppy tracking dataset.
periods (pd.DataFrame) – DataFrame containing information about the periods
- Returns:
All tracking data in a TrackingData object
- Return type:
TrackingData
- convert_kloppy_event_dataset(event_dataset: EventDataset, periods: DataFrame) EventData[source]#
Function to get all information of a game given kloppy dataset(s)
- Parameters:
event_dataset (kloppy.domain.EventDataset, optional) – A Kloppy event dataset.
periods (pd.DataFrame) – DataFrame containing information about the periods
- Returns:
All event data in an EventData object
- Return type:
EventData