MarineCadastre.gov

Vessel Traffic Data

National AIS at 1 Minute Intervals

AIS Data for 2022

AIS_2022_01_01.zip, AIS_2022_01_02.zip, …, AIS_2022_12_31.zip

i.e. AIS_2022_(0[1-9]|1[0-2])_(0[1-9]|[12][0-9]|3[01])\.zip
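The pattern can be tried against a few names with grep -E. This is a minimal sketch: the function name is_ais_2022_name is made up for illustration, and the anchors ^ and $ are added so that the whole name has to match. The pattern only checks the shape of the name, so an impossible date such as February 31 still matches.

```shell
# Pattern from the text, anchored so it must match the whole file name.
pattern='^AIS_2022_(0[1-9]|1[0-2])_(0[1-9]|[12][0-9]|3[01])\.zip$'

# Hypothetical helper: succeeds if $1 has the shape of a 2022 daily file name.
is_ais_2022_name() {
  printf '%s\n' "$1" | grep -Eq "$pattern"
}

for name in AIS_2022_06_20.zip AIS_2022_13_01.zip AIS_2022_02_31.zip; do
  if is_ais_2022_name "$name"; then
    echo "$name: matches"
  else
    echo "$name: no match"
  fi
done
# AIS_2022_06_20.zip: matches
# AIS_2022_13_01.zip: no match
# AIS_2022_02_31.zip: matches
```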

Nationwide Automatic Identification System 2022

Data Dictionary PDF

| # | Name | Description | Example | Units | Resolution | Type | Size |
|---|------|-------------|---------|-------|------------|------|------|
| 1 | MMSI | Maritime Mobile Service Identity value | 477220100 | | | Text | 8 |
| 2 | BaseDateTime | Full UTC date and time | 2017-02-01 20:05:07 | | YYYY-MM-DD:HH-MM-SS | DateTime | |
| 3 | LAT | Latitude | 42.35137 | decimal degrees | XX.XXXXX | Double | 8 |
| 4 | LON | Longitude | -71.04182 | decimal degrees | XXX.XXXXX | Double | 8 |
| 5 | SOG | Speed Over Ground | 5.9 | knots | XXX.X | Float | 4 |
| 6 | COG | Course Over Ground | 47.5 | degrees | XXX.X | Float | 4 |
| 7 | Heading | True heading angle | 45.1 | degrees | XXX.X | Float | 4 |
| 8 | VesselName | Name as shown on the station radio license | OOCL Malaysia | | | Text | 32 |
| 9 | IMO | International Maritime Organization Vessel number | IMO9627980 | | | Text | 16 |
| 10 | CallSign | Call sign as assigned by FCC | VRME7 | | | Text | 8 |
| 11 | VesselType | Vessel type as defined in NAIS specifications | 70 | | | Integer | short |
| 12 | Status | Navigation status as defined by the COLREGS | 3 | | | Integer | short |
| 13 | Length | Length of vessel (see NAIS specifications) | 71 | meters | XXX.X | Float | 4 |
| 14 | Width | Width of vessel (see NAIS specifications) | 12 | meters | XXX.X | Float | 4 |
| 15 | Draft | Draft depth of vessel (see NAIS specifications) | 3.5 | meters | XXX.X | Float | 4 |
| 16 | Cargo | Cargo type (see NAIS specification and codes) | 70 | | | Text | 4 |
| 17 | TransceiverClass | Class of AIS transceiver | A | | | Text | 2 |
17 TransceiverClass Class of AIS transceiver A Text 2

AIS Fundamentals | Spire Maritime Documentation

Unix text processing

curl -O https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2022/AIS_2022_06_20.zip
unzip AIS_2022_06_20.zip
head -n 5 AIS_2022_06_20.csv
MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselName,IMO,CallSign,VesselType,Status,Length,Width,Draft,Cargo,TransceiverClass
538009563,2022-06-20T00:00:04,29.23668,-116.63519,20.4,149.9,152.0,DEL MONTE PRIDE,IMO9869693,V7A4893,70,0,192,30,7.6,70,A
367481660,2022-06-20T00:00:05,38.58858,-90.19843,0.0,360.0,511.0,MIRANDA PAIGE,IMO8976578,WDF7156,31,0,21,9,,31,A
303200000,2022-06-20T00:00:06,36.85888,-76.34542,0.0,213.9,337.0,TAURUS,IMO7819498,WDB6361,31,0,22,7,,32,A
368011450,2022-06-20T00:00:06,29.61732,-89.89189,6.1,297.8,511.0,KRISTIN,,WDJ7927,31,0,17,7,,31,A
tail -n 5 AIS_2022_06_20.csv
303533000,2022-06-20T23:17:52,13.46087,144.66438,0.4,266.0,511.0,HURAO,IMO9277230,WDL4585,52,15,29,9,4.0,52,A
303533000,2022-06-20T23:21:21,13.46088,144.66436,0.0,244.7,511.0,HURAO,IMO9277230,WDL4585,52,15,29,9,4.0,52,A
303533000,2022-06-20T23:37:22,13.45743,144.65525,7.8,260.7,511.0,HURAO,IMO9277230,WDL4585,52,15,29,9,4.0,52,A
303533000,2022-06-20T23:44:22,13.45513,144.64894,0.7,215.1,511.0,HURAO,IMO9277230,WDL4585,52,15,29,9,4.0,52,A
303533000,2022-06-20T23:53:42,13.45187,144.64068,0.2,290.2,511.0,HURAO,IMO9277230,WDL4585,52,15,29,9,4.0,52,A
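Several rows above have Heading 511.0, which in AIS is the value for "heading not available". The sketch below shows one way to make that explicit with awk. The function name heading_or_na is hypothetical, and splitting on commas assumes that no quoted field (for example a vessel name) contains a comma.

```shell
# Read AIS CSV on stdin and print MMSI, timestamp, and heading,
# showing the "not available" heading value 511 as NA.
# Assumption: fields never contain embedded commas.
heading_or_na() {
  awk -F, 'NR > 1 { print $1, $2, ($7 == 511 ? "NA" : $7) }'
}

# Header and first rows copied from the head output above.
heading_or_na <<'EOF'
MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselName,IMO,CallSign,VesselType,Status,Length,Width,Draft,Cargo,TransceiverClass
538009563,2022-06-20T00:00:04,29.23668,-116.63519,20.4,149.9,152.0,DEL MONTE PRIDE,IMO9869693,V7A4893,70,0,192,30,7.6,70,A
367481660,2022-06-20T00:00:05,38.58858,-90.19843,0.0,360.0,511.0,MIRANDA PAIGE,IMO8976578,WDF7156,31,0,21,9,,31,A
303200000,2022-06-20T00:00:06,36.85888,-76.34542,0.0,213.9,337.0,TAURUS,IMO7819498,WDB6361,31,0,22,7,,32,A
368011450,2022-06-20T00:00:06,29.61732,-89.89189,6.1,297.8,511.0,KRISTIN,,WDJ7927,31,0,17,7,,31,A
EOF
# 538009563 2022-06-20T00:00:04 152.0
# 367481660 2022-06-20T00:00:05 NA
# 303200000 2022-06-20T00:00:06 337.0
# 368011450 2022-06-20T00:00:06 NA
```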

AWS S3

I first set up the cloud workflow with AWS S3 and then repeated it with Google Cloud Storage. The AWS S3 steps are kept here for reference.

Using high-level (s3) commands with the AWS CLI

Each zip file we download is around 300 MB, and the decompressed csv file is around 900 MB. My EC2 instance does not have enough free space to hold both, so the following steps cannot be run from EC2 Instance Connect.
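Whether a machine can hold one day of data can be checked before downloading. The sketch below compares the free space in the current directory with a threshold in MB. The helper name have_space_mb is made up for illustration, and 1200 MB is just the sum of the rough sizes quoted above.

```shell
# Hypothetical helper: succeeds if the filesystem holding the current
# directory has at least $1 MB available (df -Pk reports 1024-byte blocks).
have_space_mb() {
  avail_kb=$(df -Pk . | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $(( $1 * 1024 )) ]
}

# About 300 MB for the zip plus about 900 MB for the csv.
if have_space_mb 1200; then
  echo "enough space for one zip and its csv"
else
  echo "not enough space here"
fi
```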

Locally we run the following:

Download to a file named by the URL

This is the -O (uppercase letter o) option, or --remote-name for the long name version. The -O option selects the local file name to use by picking the file name part of the URL that you provide. This is important. You specify the URL and curl picks the name from this data. If the site redirects curl further (and if you tell curl to follow redirects), it does not change the file name curl will use for storing this.
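For the NOAA URLs used here, the name that curl -O saves to is simply the last path component. The sketch below reproduces that name with basename. It is only an illustration for plain URLs like this one, not a description of how curl derives the name internally.

```shell
url='https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2022/AIS_2022_06_20.zip'

# Last path component of the URL, which is the local file name
# we see after running curl -O on it.
basename "$url"
# AIS_2022_06_20.zip
```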

for i in {21..27}; do \
curl -O https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2022/AIS_2022_06_${i}.zip; \
unzip AIS_2022_06_${i}.zip; \
done
for i in {21..27}; do aws s3 cp AIS_2022_06_${i}.csv s3://jordanbell2357ais/; done

The csv files are uploaded to the S3 bucket s3://jordanbell2357ais/.

Google Cloud Storage

Discover object storage with the gsutil tool

EC2 Instance Connect was not an option in my AWS configuration, because each zip file and its csv file together take over 1 GB. In my Google Cloud Platform configuration, on the other hand, the 5 GB of available space is enough to hold one zip file and its csv file at a time.

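We write out the steps for June 21, 2022, with the day held in a variable so that the same commands work for any day of the month.

i=21 # day of month, zero-padded (01-30 for June)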
curl -O https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2022/AIS_2022_06_${i}.zip
unzip AIS_2022_06_${i}.zip
gsutil cp AIS_2022_06_${i}.csv gs://jordanbell2357marinecadastre/
rm AIS_2022_06_${i}.zip
rm AIS_2022_06_${i}.csv
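The steps above can be wrapped in a function so that a later step only runs if the earlier one succeeded. This is a sketch under assumptions: the names run and process_day and the DRY_RUN switch are made up for illustration, and the -f option (fail on HTTP errors) is an addition to the curl command used above. With DRY_RUN=1 the commands are only printed, so nothing is downloaded or uploaded.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Download, unzip, upload, and clean up one day of June 2022.
# $1 is the zero-padded day of the month, e.g. 21.
process_day() {
  base="AIS_2022_06_$1"
  run curl -fO "https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2022/${base}.zip" &&
  run unzip "${base}.zip" &&
  run gsutil cp "${base}.csv" gs://jordanbell2357marinecadastre/ &&
  run rm "${base}.zip" "${base}.csv"
}

DRY_RUN=1        # print only; set to 0 to actually run the commands
process_day 21
```

Because each file is removed before the next day is fetched, a loop over process_day never needs room for more than one zip file and one csv file at a time.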

Relevant Google Cloud Self-Paced Labs (GSP): Cloud Storage: Qwik Start - CLI/SDK (GSP074), Ingesting Data Into The Cloud (GSP194), Ingesting New Datasets into BigQuery (GSP411), Loading Your Own Data into BigQuery (GSP865).

We now load the csv files from Cloud Storage into BigQuery with bq:

for i in {21..27}; do bq load --source_format=CSV --autodetect AIS_2022_06_21_to_27.AIS_2022_06_${i} gs://jordanbell2357marinecadastre/AIS_2022_06_${i}.csv; done
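With --autodetect, BigQuery infers the column types from the data, so a field such as MMSI may come out as an integer even though the data dictionary lists it as Text. An alternative is to pass an explicit schema as the last argument of bq load. The sketch below only prints the commands. The type choices are my reading of the data dictionary, not the schema that autodetection produced, and whether DATETIME accepts the T-separated timestamps in these files should be checked on one file first.

```shell
# Assumed column types, taken from the data dictionary (17 columns).
schema='MMSI:STRING,BaseDateTime:DATETIME,LAT:FLOAT,LON:FLOAT,SOG:FLOAT,COG:FLOAT,Heading:FLOAT,VesselName:STRING,IMO:STRING,CallSign:STRING,VesselType:INTEGER,Status:INTEGER,Length:FLOAT,Width:FLOAT,Draft:FLOAT,Cargo:STRING,TransceiverClass:STRING'

# Print (not run) one bq load command per day. --skip_leading_rows=1
# skips the header line, which --autodetect would otherwise handle.
for i in 21 22 23 24 25 26 27; do
  echo bq load --source_format=CSV --skip_leading_rows=1 \
    "AIS_2022_06_21_to_27.AIS_2022_06_${i}" \
    "gs://jordanbell2357marinecadastre/AIS_2022_06_${i}.csv" \
    "$schema"
done
```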

To check that the same task also works from a local machine, we repeat the above locally for June 20, 2022.

curl -O https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2022/AIS_2022_06_20.zip
unzip AIS_2022_06_20.zip

If we now run

gsutil cp AIS_2022_06_20.csv gs://jordanbell2357marinecadastre/

we get

ResumableUploadAbortException: 401 Anonymous caller does not have storage.objects.create access to the Google Cloud Storage object. Permission 'storage.objects.create' denied on resource (or it may not exist).

Initializing the gcloud CLI

Install gsutil

We run

gcloud init

and then run

gsutil cp AIS_2022_06_20.csv gs://jordanbell2357marinecadastre/

with success. Then

bq load --source_format=CSV --autodetect AIS_2022_06_21_to_27.AIS_2022_06_20 gs://jordanbell2357marinecadastre/AIS_2022_06_20.csv

with success. Now we clean up:

rm AIS_2022_06_20.zip
rm AIS_2022_06_20.csv
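The 401 "Anonymous caller" error above came from running gsutil on a machine with no credentials. A check along the following lines can catch that before a large upload starts. The function name have_active_gcloud_account is made up for illustration. It relies on gcloud auth list, which prints the active account when one is configured.

```shell
# Hypothetical helper: succeeds if gcloud is installed and reports
# an active account.
have_active_gcloud_account() {
  command -v gcloud >/dev/null 2>&1 || return 1
  [ -n "$(gcloud auth list --filter=status:ACTIVE --format='value(account)' 2>/dev/null)" ]
}

if have_active_gcloud_account; then
  echo "credentials found"
else
  echo "no active account: run gcloud init first"
fi
```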

BigQuery

Query syntax | BigQuery

(SELECT MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselType,Status,Length,Width FROM ais-data-385301.AIS_2022_06_21_to_27.AIS_2022_06_21)
UNION ALL
(SELECT MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselType,Status,Length,Width FROM ais-data-385301.AIS_2022_06_21_to_27.AIS_2022_06_22)
UNION ALL
(SELECT MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselType,Status,Length,Width FROM ais-data-385301.AIS_2022_06_21_to_27.AIS_2022_06_23)
UNION ALL
(SELECT MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselType,Status,Length,Width FROM ais-data-385301.AIS_2022_06_21_to_27.AIS_2022_06_24)
UNION ALL
(SELECT MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselType,Status,Length,Width FROM ais-data-385301.AIS_2022_06_21_to_27.AIS_2022_06_25)
UNION ALL
(SELECT MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselType,Status,Length,Width FROM ais-data-385301.AIS_2022_06_21_to_27.AIS_2022_06_26)
UNION ALL
(SELECT MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselType,Status,Length,Width FROM ais-data-385301.AIS_2022_06_21_to_27.AIS_2022_06_27)

We save the results to a new BigQuery table named AIS_2022_06_21_to_27 (ais-data-385301.AIS_2022_06_21_to_27.AIS_2022_06_21_to_27).
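The seven SELECT statements differ only in the day, so the query text can also be generated with a loop. This is a sketch: the function name build_union_query is made up for illustration, and the backticks around the table path are added here to quote the hyphenated project ID.

```shell
cols='MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselType,Status,Length,Width'
dataset='ais-data-385301.AIS_2022_06_21_to_27'

# Print one SELECT per day, June 21 to 27, joined by UNION ALL.
build_union_query() {
  first=1
  for i in 21 22 23 24 25 26 27; do
    [ "$first" = 1 ] || echo 'UNION ALL'
    first=0
    printf '(SELECT %s FROM `%s.AIS_2022_06_%s`)\n' "$cols" "$dataset" "$i"
  done
}

build_union_query
```

The printed query can be pasted into the BigQuery console as before. Running it from the command line with bq query --use_legacy_sql=false and --destination_table should also work, but that route was not used in these notes.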