Conventional full-waveform inversion (FWI) using the least-squares norm ($L_2$) as a misfit function is known to suffer from cycle skipping. We propose the quadratic Wasserstein metric ($W_2$) as a new misfit function for FWI. It has been proved to have desirable properties with regard to convexity and insensitivity to noise. Unlike the $L_2$ norm, $W_2$ measures not only amplitude differences but also global phase shifts, which helps to avoid cycle skipping. We present two ways of using the $W_2$ metric in FWI: trace-by-trace comparison and global comparison. Numerical results on synthetic models and field data demonstrate the potential of this new misfit function.
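The sensitivity of $W_2$ to global shifts can be illustrated with a small one-dimensional sketch. It is not the implementation used in this work. It relies on the standard closed form for one-dimensional optimal transport, $W_2^2(f, g) = \int_0^1 |F^{-1}(s) - G^{-1}(s)|^2 \, ds$, where $F$ and $G$ are the cumulative distribution functions of $f$ and $g$. The example assumes nonnegative signals rescaled to unit mass (signed seismic traces would first need a normalization step, which is not shown), and the names `gaussian_pulse`, `l2_misfit`, and `w2_squared_1d` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def gaussian_pulse(t, center, width=0.2):
    """Nonnegative test signal: a Gaussian bump centred at `center`."""
    return np.exp(-0.5 * ((t - center) / width) ** 2)


def l2_misfit(f, g, t):
    """Least-squares misfit: the integral of (f - g)^2 over time."""
    dt = t[1] - t[0]
    return np.sum((f - g) ** 2) * dt


def w2_squared_1d(f, g, t, n_quantiles=2000):
    """Squared W2 distance between two nonnegative 1D signals on grid t.

    Assumption: f and g are nonnegative, so that rescaling them to unit
    mass turns them into probability densities.
    """
    dt = t[1] - t[0]
    # Rescale both signals to unit mass.
    f = f / (np.sum(f) * dt)
    g = g / (np.sum(g) * dt)
    # Cumulative distribution functions on the time grid.
    F = np.cumsum(f) * dt
    G = np.cumsum(g) * dt
    # Midpoint quantile levels in (0, 1); the end points are avoided
    # because the CDFs are flat there.
    s = (np.arange(n_quantiles) + 0.5) / n_quantiles
    # Inverse CDFs obtained by interpolating time as a function of mass.
    F_inv = np.interp(s, F, t)
    G_inv = np.interp(s, G, t)
    # Midpoint-rule approximation of the integral over s in (0, 1).
    return np.mean((F_inv - G_inv) ** 2)


t = np.linspace(0.0, 10.0, 2001)
reference = gaussian_pulse(t, 3.0)
for shift in (0.5, 1.0, 2.0, 3.0):
    shifted = gaussian_pulse(t, 3.0 + shift)
    print(shift, l2_misfit(reference, shifted, t),
          w2_squared_1d(reference, shifted, t))
```

For a pure time shift of size $\tau$ between two otherwise identical densities, $W_2^2 = \tau^2$ in the continuous setting, so the misfit keeps growing quadratically with the shift. The $L_2$ misfit, by contrast, stops changing once the two pulses no longer overlap and therefore carries no information about how far apart they are.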