MAWI Working Group Traffic Archive


Packet traces from WIDE backbone

This is a traffic data repository maintained by the MAWI Working Group of the WIDE Project.

Currently, traffic traces are collected at the following sampling points:

samplepoint-F
daily traces at the transit link of WIDE to the upstream ISP, in operation since 2006/07/01: 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024.
longer traces: 48-hour-long traces on 2007/01/09-11, 72-hour-long traces on 2008/03/18-20, 96-hour-long traces on 2009/03/30-04/02, 83-hour-long traces on 2010/04/13-16, 63-hour-long traces on 2012/03/30-04/01, 72-hour-long traces on 2013/06/25-27, 24-hour-long traces on 2014/10/02, 2014/12/10, 2021/04/14, 2023/04/12, 48-hour-long traces on 2015/12/02-03, 2017/04/12-13, 2018/05/09-10, 2019/04/09-10, 2020/04/08-09, 2022/04/13-14 as part of a Day in the Life of the Internet project.
The link was upgraded from 100Mbps to 1Gbps with a 150Mbps Committed Access Rate (CAR) on June 1, 2007; the CAR was officially removed on June 21, 2016.
Data is missing from November 4 to December 15, 2022, due to a hardware failure during the POP migration.

WIDE started operation at another IX (BBIX) on July 31, 2023 and, as a result, the transit traffic was considerably reduced. To restore the monitoring coverage, this IX traffic was added to samplepoint-F on August 8, 2023. samplepoint-F now contains the traffic from two links: the transit link and the BBIX link.

Note: there is a considerable number of duplicated packets in the traces from May 28 to September 3, 2015, due to a misconfigured VLAN at the monitored router. (A quick way to remove the duplicates is to use editcap from the Wireshark distribution, e.g., "editcap -D64 infile outfile".)
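
For readers who process the packets in their own scripts rather than with editcap, the idea behind window-based duplicate removal can be sketched as follows. This is only an illustration, not a reimplementation of editcap: the function name drop_duplicates, the use of a BLAKE2 digest, and the exact window bookkeeping are assumptions made for this example, and packets are taken to be plain byte strings already read from a trace.

```python
import hashlib
from collections import deque


def drop_duplicates(packets, window=64):
    """Yield packets, skipping any that repeat a recently kept packet.

    Hypothetical helper for illustration. A packet is treated as a
    duplicate when its length and digest match one of the last
    `window - 1` packets that were kept.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # Fingerprints of the most recently kept packets; old entries fall
    # off the left end automatically once maxlen is reached.
    recent = deque(maxlen=window - 1)
    for pkt in packets:
        fingerprint = (len(pkt), hashlib.blake2b(pkt, digest_size=16).digest())
        if fingerprint in recent:
            continue  # same bytes seen within the window: drop it
        recent.append(fingerprint)
        yield pkt
```

A small window is sufficient when duplicates arrive back to back, as is typical when the same frame is mirrored twice; a larger window costs more comparisons per packet and raises the chance of discarding a genuine retransmission with identical bytes.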

Note: a large amount of ICMP traffic has been present in the traces since 2013, caused by the USC ANT project probing the entire IPv4 space. The probing started sporadically in September 2011 and changed to constant, higher-rate probing on March 27, 2013. It stopped on December 4, 2020, but resumed on July 2, 2023.
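
To get a rough sense of how much of a trace is ICMP before deciding whether to filter it, one can count ICMP packets directly from the capture file. The sketch below is a minimal example under stated assumptions: the input is a classic pcap file (not pcapng) with Ethernet link type, frames carry no VLAN tag, and only IPv4 is inspected. The function name icmp_share is hypothetical.

```python
import struct


def icmp_share(pcap_bytes):
    """Return (icmp_packets, total_packets) for a classic pcap byte string.

    Hypothetical helper for illustration. Assumes Ethernet framing
    without VLAN tags and looks only at the IPv4 protocol field.
    """
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")

    # The link type is the last 32-bit field of the 24-byte global header.
    (linktype,) = struct.unpack(endian + "I", pcap_bytes[20:24])
    if linktype != 1:
        raise ValueError("this sketch handles only Ethernet (link type 1)")

    offset = 24
    icmp = total = 0
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len.
        _, _, incl_len, _ = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        total += 1
        # EtherType 0x0800 is IPv4; the protocol field is byte 9 of the
        # IP header, i.e. byte 23 of the frame. Protocol 1 is ICMP.
        if len(frame) >= 24 and frame[12:14] == b"\x08\x00" and frame[23] == 1:
            icmp += 1
    return icmp, total
```

For full-size traces it is more practical to stream the file than to load it into memory, or simply to apply a capture filter such as "not icmp" when reading the trace with tcpdump.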

You can browse the traffic of this link using the agurim tool from here.

Older traces:

samplepoint-G (2018 May -- 2020 June)
weekly traces from the main IX link of WIDE to DIX-IE: 2018, 2019, 2020.
longer traces: 24-hour-long traces on 2018/05/09, and 2019/04/09, and an 8-hour-long trace on 2020/04/08.
samplepoint-A (terminated in Nov 2000)
daily trace of a trans-Pacific T1 line (one of them): 1999, 2000
Y2K roll-over
samplepoint-B
daily trace of another trans-Pacific line (18Mbps CAR on 100Mbps link): 2000, 2001, 2002, 2003, 2004, 2005, 2006. This link was terminated on 2006/07/01 and replaced by samplepoint-F.
107-hour-long traces from another US-Japan link taken from 1999/05/10 10pm to 1999/05/15 7am.
The link was upgraded from 4Mbps to 10Mbps on 5/13 around 15:30.
24-hour-long traces on 2003/02/27, 2005/01/07, 2005/09/22, and 2006/03/03.
plot graphs by aguri: port numbers: 2001, 2002, 2003, 2004, 2005, 2006 (wide-members only: addresses: 2001, 2002, 2003, 2004, 2005, 2006)
samplepoint-C
daily trace at an IPv6 line connected to 6Bone: 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
samplepoint-D
daily trace at an IPv6 line connected to WIDE-6Bone: 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
samplepoint-E
yet another US-Japan line (OC-3) which came into operation in Dec 2001:
plot graphs by aguri: port numbers: 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 (wide-members only: addresses: 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009)

Traffic traces are captured with tcpdump, and the IP addresses in the traces are then scrambled by a modified version of tcpdpriv.
The source code of the set of tools used to create our archive is also available.

MAWILab is a database that helps researchers evaluate their traffic anomaly detection methods. It consists of a set of labels locating traffic anomalies in the MAWI archive (samplepoints B and F). The labels are obtained using an advanced graph-based methodology that compares and combines different and independent anomaly detectors. The data set is updated daily to include new traffic from upcoming applications and anomalies.

You may use WIDE traffic data only for research purposes. Actions that infringe on users' privacy are prohibited.

Here are the guidelines for protecting user privacy in the WIDE traffic archive (Japanese version).

MAWI traffic archive Frequently Asked Questions.

Related Papers

  1. Kenjiro Cho.
    "Recursive Lattice Search: Hierarchical Heavy Hitters Revisited".
    ACM IMC 2017, London, UK, November 2017.
  2. Romain Fontugne, Patrice Abry, Kensuke Fukuda, Darryl Veitch, Kenjiro Cho, Pierre Borgnat, Herwig Wendt.
    "Scaling in Internet Traffic: a 14 year and 3 day longitudinal study, with multiscale analyses and random projections".
    IEEE/ACM Transactions on Networking, vol.25(4). pp2152--2165. August 2017.
  3. Midori Kato, Kenjiro Cho, Michio Honda, Hideyuki Tokuda.
    "Monitoring the Dynamics of Network Traffic by Recursive Multi-dimensional Aggregation".
    OSDI2012 MAD Workshop. Hollywood, CA. October 2012.
  4. R. Fontugne, P. Borgnat, P. Abry, K. Fukuda.
    "MAWILab: Combining diverse anomaly detectors for automated anomaly labeling and performance benchmarking".
    ACM CoNEXT 2010. Philadelphia, PA. December 2010.
  5. Pierre Borgnat, Guillaume Dewaele, Kensuke Fukuda, Patrice Abry, Kenjiro Cho.
    "Seven Years and One Day: Sketching the Evolution of Internet Traffic."
    INFOCOM2009. Rio de Janeiro, Brazil. April 2009.
  6. Guillaume Dewaele, Kensuke Fukuda, Pierre Borgnat, Patrice Abry, Kenjiro Cho.
    "Extracting Hidden Anomalies using Sketch and Non Gaussian Multiresolution Statistical Detection Procedures".
    SIGCOMM2007 LSAD Workshop, Kyoto, Japan. August 2007.
  7. Kenjiro Cho, Ryo Kaizaki and Akira Kato.
    "Aguri: An Aggregation-based Traffic Profiler".
    In Proceedings of QofIS2001 (published by Springer-Verlag in the LNCS series). September 2001.
  8. Kenjiro Cho, Koushirou Mitsuya and Akira Kato.
    "Traffic Data Repository at the WIDE Project".
    USENIX 2000 FREENIX Track, San Diego, CA, June 2000. (HTML version)
  9. Akira KATO, Jun MURAI, Satoshi KATSUNO and Tohru ASAMI.
    "An Internet Traffic Data Repository: The Architecture and the Design Policy".
    INET99, San Jose, CA, June 1999.


contact info: Kenjiro Cho, WIDE Project (kjc at wide.ad.jp)