Sports Field Registration via Keypoints-aware Label Condition

Camera Calibration / Multi-view Camera


  • A new deep learning framework for sports field registration using dense keypoints with an instance segmentation network.
  • Introduces a new large-scale dataset for camera calibration in football.
Semantic segmentation treats all items of a category as a single region. Instance segmentation additionally separates the individual items within each category.
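To make the difference concrete, here is a toy sketch on a single row of pixels. The helper `runs_to_instances` is a hypothetical name of mine, and splitting contiguous runs is only an illustration of what the two label maps look like. It is not how the paper's network produces instances.

```python
def runs_to_instances(semantic_row, background=0):
    """Give each contiguous run of foreground pixels its own instance id.

    Toy illustration only: a real instance segmentation network learns
    this separation instead of relying on contiguity.
    """
    instance_row = []
    next_id = 0
    prev = background
    for label in semantic_row:
        if label == background:
            instance_row.append(0)
        else:
            if label != prev:
                next_id += 1  # a new run starts, so a new instance starts
            instance_row.append(next_id)
        prev = label
    return instance_row


# Semantic view: both objects share class id 1.
semantic_row = [0, 1, 1, 0, 1, 1, 1, 0]
# Instance view: the two objects get ids 1 and 2.
instance_row = runs_to_instances(semantic_row)
print(instance_row)
```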


Why do we want to do this?

  • Better camera calibration techniques.
  • Previous datasets were small and mostly closed to the public.

Why was this not done before?

This paper builds on "A Robust and Efficient Framework for Sports-Field Registration", but uses an instance segmentation architecture instead of a semantic segmentation architecture.


New Design Choices

Instance Segmentation / Dynamic Filter Learning

Dynamic Filter Learning Paper

DoDNet Architecture (source)
I’m not convinced that dynamic filters are significant, but I don’t fully understand them.
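For my own understanding, here is a minimal sketch of the dynamic filter idea as I read it from DoDNet: a small controller turns a condition (here a one-hot label) into the weights of a 1x1 convolution, and that generated convolution is applied to a shared feature map. All names (`controller`, `dynamic_head`), shapes, and numbers are my assumptions for illustration. This is not the paper's implementation.

```python
def controller(one_hot, W, b):
    """Linear layer mapping a one-hot condition to filter parameters.

    Returns C kernel weights followed by one bias (assumed layout).
    """
    return [
        sum(w * x for w, x in zip(row, one_hot)) + bias
        for row, bias in zip(W, b)
    ]


def dynamic_head(features, params):
    """Apply the generated 1x1 convolution to every pixel.

    features: list of pixels, each a list of C channel values.
    params:   C kernel weights followed by one bias.
    """
    *kernel, bias = params
    return [
        sum(k * f for k, f in zip(kernel, pixel)) + bias
        for pixel in features
    ]


# Shared features: 2 pixels, 2 channels each (made-up values).
features = [[0.2, 0.9], [0.7, 0.1]]

# Hand-picked controller weights so that condition i selects channel i.
W = [[1, 0], [0, 1], [0, 0]]  # (C + 1) rows, one column per condition
b = [0, 0, 0]

mask_for_label_0 = dynamic_head(features, controller([1, 0], W, b))
mask_for_label_1 = dynamic_head(features, controller([0, 1], W, b))
print(mask_for_label_0, mask_for_label_1)
```

The point of the design, as far as I can tell, is that one shared backbone can produce a different output per condition without needing one output channel per label.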

Encoder-Decoder Architecture (U-Net)

U-Net Architecture (source)

Loss Functions

The loss is a weighted combination of the following:

  • Binary Dice Loss
  • Binary Cross Entropy
  • Weighted Cross Entropy Loss
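A rough sketch of how these three terms could be combined, using textbook definitions of each loss. The function names, the mixing weights, and the class weights are placeholders I chose. The paper's actual formulation and weights may differ.

```python
import math


def binary_dice_loss(probs, targets, eps=1e-6):
    """1 - Dice coefficient between predicted probabilities and a 0/1 mask."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def weighted_cross_entropy(class_probs, labels, class_weights, eps=1e-7):
    """Multi-class cross entropy where each class has its own weight.

    class_probs: per-pixel lists of class probabilities.
    labels:      per-pixel ground-truth class indices.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for probs, y in zip(class_probs, labels):
        w = class_weights[y]
        weighted_sum -= w * math.log(max(probs[y], eps))
        weight_total += w
    return weighted_sum / weight_total


def combined_loss(dice, bce, wce, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three terms (the weights are assumptions)."""
    return weights[0] * dice + weights[1] * bce + weights[2] * wce


# Made-up predictions for a 4-pixel mask and a 2-pixel, 2-class map.
dice = binary_dice_loss([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0])
bce = binary_cross_entropy([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0])
wce = weighted_cross_entropy([[0.7, 0.3], [0.4, 0.6]], [0, 1], [1.0, 2.0])
print(combined_loss(dice, bce, wce, weights=(1.0, 0.5, 0.5)))
```

Dice helps when the foreground is tiny compared to the background, which is the case for keypoints on a field, while the cross entropy terms give a smoother per-pixel signal.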


Results are on par with, or slightly better than, previous state-of-the-art methods.

This note is a part of my paper notes series. You can find more here or on Twitter.