ninarg committed on
Commit a67b01f · 1 Parent(s): c5e72f6

Fix DFG tab: drop case-less events before pm4py conversion

The native OCEL 2.0 events parquet has ~33% case-less events
(shared-resource activities), but pm4py's discover_dfg requires a
case ID per event. Drop them in load_data() so the first (DFG)
tab renders cleanly without affecting the other tabs.
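
To illustrate why a directly-follows graph needs a case ID on every event, here is a minimal pandas-only sketch. It is not pm4py's implementation; the helper name `directly_follows_counts`, the `activity` column and the sample rows are assumptions made for this example. Each edge is a pair of consecutive activities within one case, so an event without a case has no sequence to belong to.

```python
import pandas as pd


def directly_follows_counts(df: pd.DataFrame) -> dict:
    """Count (activity, next activity) pairs within each case."""
    ordered = df.sort_values(["case_id", "timestamp"], kind="stable")
    # Next activity inside the same case; the last event of a case gets NaN.
    nxt = ordered.groupby("case_id")["activity"].shift(-1)
    pairs = pd.DataFrame({"src": ordered["activity"], "dst": nxt}).dropna()
    return pairs.groupby(["src", "dst"]).size().to_dict()


# Hypothetical events: two cases, every row has a case ID and a timestamp.
events = pd.DataFrame(
    {
        "case_id": ["c1", "c1", "c1", "c2", "c2"],
        "activity": ["create order", "pick", "ship", "create order", "ship"],
        "timestamp": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-01", "2024-01-04"]
        ),
    }
)

dfg = directly_follows_counts(events)
print(dfg)
```

Grouping by `case_id` is what defines "directly follows" here, which is why rows with a missing case ID have to be removed or handled separately before discovery.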

Files changed (1)
  1. app.py +3 -2
app.py CHANGED
@@ -15,8 +15,9 @@ def load_data():
     ds = load_dataset("VynFi/vynfi-supply-chain-ocel", "events", split="train")
     df = ds.to_pandas()
     df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
-    # Drop rows with NaT timestamps (pm4py can't handle them)
-    df = df.dropna(subset=["timestamp"])
+    # pm4py needs both timestamp and case_id present; native OCEL 2.0 has
+    # case-less events for shared-resource activities, drop them here.
+    df = df.dropna(subset=["timestamp", "case_id"])
     # Rename for pm4py — use safe names without colons for display
     df = df.rename(columns={
         "case_id": "case_id_pm",
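
The filtering step in this hunk can be tried on a toy frame. This is a minimal sketch assuming only pandas; `case_id` and `timestamp` follow the diff, while the helper name `drop_caseless`, the `activity` column and the sample rows are hypothetical.

```python
import pandas as pd


def drop_caseless(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only events that have both a parsed timestamp and a case ID."""
    out = df.copy()
    # Unparseable timestamps become NaT, which dropna treats as missing.
    out["timestamp"] = pd.to_datetime(out["timestamp"], errors="coerce")
    return out.dropna(subset=["timestamp", "case_id"])


# Hypothetical sample: one event has no case, one has a bad timestamp.
events = pd.DataFrame(
    {
        "case_id": ["c1", "c1", None, "c2"],
        "activity": ["create order", "ship", "maintain truck", "create order"],
        "timestamp": ["2024-01-01", "2024-01-02", "2024-01-02", "not a date"],
    }
)

caseless_share = events["case_id"].isna().mean()
clean = drop_caseless(events)
print(f"case-less share: {caseless_share:.0%}, rows kept: {len(clean)}")
```

Measuring the share of missing case IDs before dropping them is a cheap way to confirm how much of the log the filter removes.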