Tech Blog

CV / Object Detection

2207.10758 DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection

Arxiv Link: http://arxiv.org/abs/2207.10758

Github Link: https://github.com/abhi1kumar/deviant

Authors: Abhinav Kumar Garrick Brazil Enrique Corona Armin Parchami Xiaoming Liu

Tag: object detection neural network

2207.10774 Focused Decoding Enables 3D Anatomical Detection by Transformers

Arxiv Link: http://arxiv.org/abs/2207.10774

Github Link: https://github.com/bwittmann/transoar

Authors: Bastian Wittmann Fernando Navarro Suprosanna Shit Bjoern Menze

Tag: object detection attention mechanism vision transformer transformer encoder encoder-decoder architecture

2207.10936 Long-tailed Instance Segmentation using Gumbel Optimized Loss

Arxiv Link: http://arxiv.org/abs/2207.10936

Github Link: https://github.com/kostas1515/gol

Authors: Konstantinos Panagiotis Alexandridis Jiankang Deng Anh Nguyen Shan Luo

Tag: instance segmentation object detection
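
As the title suggests, the Gumbel Optimized Loss builds the classification loss around the Gumbel distribution rather than the sigmoid's logistic one, which suits the rare-class regime of long-tailed data. A minimal sketch of that substitution in a one-vs-all binary cross-entropy; this illustrates the idea only, not the repo's exact loss:

```python
import torch

def gumbel_activation(logits: torch.Tensor) -> torch.Tensor:
    # CDF of the standard Gumbel distribution, used where sigmoid would be
    return torch.exp(-torch.exp(-logits))

def gumbel_bce(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # one-vs-all binary cross-entropy on Gumbel-activated logits
    p = gumbel_activation(logits).clamp(eps, 1 - eps)
    return -(targets * p.log() + (1 - targets) * (1 - p).log()).mean()
```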

2207.11031 MobileDenseNet: A new approach to object detection on mobile devices

Arxiv Link: http://arxiv.org/abs/2207.11031

Github Link: https://github.com/hajizadeh/mobiledensenet

Authors: Mohammad Hajizadeh Mohammad Sabokrou Adel Rahmani

Tag: real-time pascal voc source code object detection

2207.11103 DeVIS: Making Deformable Transformers Work for Video Instance Segmentation

Arxiv Link: http://arxiv.org/abs/2207.11103

Github Link: https://github.com/acaelles97/devis

Authors: Adrià Caelles Tim Meinhardt Guillem Brasó Laura Leal-Taixé

Tag: video instance segmentation video sequence feature map object detection instance segmentation segmentation task

2207.11184 Multi-Faceted Distillation of Base-Novel Commonality for Few-shot Object Detection

Arxiv Link: http://arxiv.org/abs/2207.11184

Github Link: https://github.com/wushuang1998/mfdc

Authors: Shuang Wu Wenjie Pei Dianwen Mei Fanglin Chen Jiandong Tian

Tag: memory bank object detection object detector

2207.08531 DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection

Arxiv Link: http://arxiv.org/abs/2207.08531

Github Link: https://github.com/spengliang/did-m3d

Authors: Liang Peng Xiaopei Wu Zheng Yang Haifeng Liu Deng Cai

Tag: depth estimation object detection data augmentation ablation study

2207.11368 Neural-Sim: Learning to Generate Training Data with NeRF

Arxiv Link: http://arxiv.org/abs/2207.11368

Github Link: https://github.com/gyhandy/neural-sim-nerf

Authors: Yunhao Ge Harkirat Behl Jiashu Xu Suriya Gunasekar Neel Joshi

Tag: computer vision model computer vision detection task object detection vision model application scenario synthetic data

2207.11455 UC-OWOD: Unknown-Classified Open World Object Detection

Arxiv Link: http://arxiv.org/abs/2207.11455

Github Link: https://github.com/johnwuzh/uc-owod

Authors: Zhiheng Wu Yue Lu Xingyu Chen Zhengxing Wu Liwen Kang

Tag: object detection computer vision object detector

2207.11753 Label-Guided Auxiliary Training Improves 3D Object Detector

Arxiv Link: http://arxiv.org/abs/2207.11753

Github Link: https://github.com/fabiencode/lg3d

Authors: Yaomin Huang Xinmei Liu Yichen Zhu Zhiyuan Xu Chaomin Shen

Tag: object detection point cloud object detector bounding box

2207.12654 ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection

Arxiv Link: http://arxiv.org/abs/2207.12654

Github Link: https://github.com/yinjunbo/proposalcontrast

Authors: Junbo Yin Dingfu Zhou Liangjun Zhang Jin Fang Cheng-Zhong Xu

Tag: object detection point cloud receptive field

2207.12655 Semi-supervised 3D Object Detection with Proficient Teachers

Arxiv Link: http://arxiv.org/abs/2207.12655

Github Link: https://github.com/yinjunbo/proficientteachers

Authors: Junbo Yin Jin Fang Dingfu Zhou Liangjun Zhang Cheng-Zhong Xu

Tag: teacher model object detection contrastive learning supervised learning point cloud pseudo label autonomous driving object detector
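
The teacher-model and pseudo-label tags point at a mean-teacher-style setup: a teacher network, maintained as an exponential moving average (EMA) of the student, supplies pseudo labels on unlabeled point clouds. A minimal sketch of the EMA step such pipelines share (the paper's specific teacher refinements live in the repo):

```python
import copy
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, m: float = 0.999):
    # teacher <- m * teacher + (1 - m) * student, parameter by parameter
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(m).add_(s, alpha=1.0 - m)

# the teacher starts as a frozen copy of the student:
# teacher = copy.deepcopy(student).eval()
```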

2207.12716 MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbone

Arxiv Link: http://arxiv.org/abs/2207.12716

Github Link: https://github.com/tai-wang/depth-from-motion

Authors: Tai Wang Qing Lian Chenming Zhu Xinge Zhu Wenwei Zhang

Tag: technical report object detection

2207.12988 Monocular 3D Object Detection with Depth from Motion

Arxiv Link: http://arxiv.org/abs/2207.12988

Github Link: https://github.com/tai-wang/depth-from-motion

Authors: Tai Wang Jiangmiao Pang Dahua Lin

Tag: depth estimation object detection camera pose

2207.13339 ALBench: A Framework for Evaluating Active Learning in Object Detection

Arxiv Link: http://arxiv.org/abs/2207.13339

Github Link: https://github.com/industryessentials/ymir

Authors: Zhanpeng Feng Shiliang Zhang Rinyoichi Takezoe Wenze Hu Manmohan Chandraker

Tag: neural architecture search active learning neural network architecture neural network object detection image classification network architecture neural architecture

2207.13362 Camouflaged Object Detection via Context-aware Cross-level Fusion

Arxiv Link: http://arxiv.org/abs/2207.13362

Github Link: https://github.com/ben57882/c2fnet-tscvt

Authors: Geng Chen Si-Jie Liu Yu-Jia Sun Ge-Peng Ji Ya-Feng Wu

Tag: benchmark dataset object detection feature representation

2207.14172 Semantic-Aligned Matching for Enhanced DETR Convergence and Multi-Scale Feature Fusion

Arxiv Link: http://arxiv.org/abs/2207.14172

Github Link: https://github.com/zhanggongjie/sam-detr

Authors: Gongjie Zhang Zhipeng Luo Yingchen Yu Jiaxing Huang Kaiwen Cui

Tag: object detection detection performance feature fusion detection task

2207.14221 Humans disagree with the IoU for measuring object detector localization errors

Arxiv Link: http://arxiv.org/abs/2207.14221

Github Link: https://github.com/ombretta/humans_vs_iou

Authors: Ombretta Strafforello Vanathi Rajasekar Osman S. Kayhan Oana Inel Jan van Gemert

Tag: object detector
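
For context, the IoU score whose fit to human judgment the paper questions is simply intersection area over union area. A minimal sketch for axis-aligned (x1, y1, x2, y2) boxes:

```python
def iou(box_a, box_b):
    # intersection over union of two axis-aligned boxes (x1, y1, x2, y2)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# e.g. two 10x10 boxes overlapping by half along one axis:
# iou((0, 0, 10, 10), (5, 0, 15, 10)) == 50 / (100 + 100 - 50) == 1/3
```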

2207.14284 HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolution

Arxiv Link: http://arxiv.org/abs/2207.14284

Github Link: https://github.com/raoyongming/hornet

Authors: Yongming Rao Wenliang Zhao Yansong Tang Jie Zhou Ser-Nam Lim

Tag: self-attention object detection swin transformer imagenet classification semantic segmentation vision transformer

CV / 3D

2207.10762 MeshLoc: Mesh-Based Visual Localization

Arxiv Link: http://arxiv.org/abs/2207.10762

Github Link: https://github.com/tsattler/meshloc_release

Authors: Vojtech Panek Zuzana Kukelova Torsten Sattler

Tag: pose estimation feature matching autonomous robot point cloud camera pose 3d model

2207.11554 Towards Open Set 3D Learning: A Benchmark on Object Point Clouds

Arxiv Link: http://arxiv.org/abs/2207.11554

Github Link: https://github.com/antoalli/3d_os

Authors: Antonio Alliegro Francesco Cappio Borlino Tatiana Tommasi

Tag: autonomous system point cloud safety-critical application 3d model

2207.11790 PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation

Arxiv Link: http://arxiv.org/abs/2207.11790

Github Link: https://github.com/gitbosun/patchrd

Authors: Bo Sun Vladimir G. Kim Noam Aigerman Qixing Huang Siddhartha Chaudhuri

Tag: generative method 3d shape neural network

2207.12485 3D Shape Sequence of Human Comparison and Classification using Current and Varifolds

Arxiv Link: http://arxiv.org/abs/2207.12485

Github Link: https://github.com/cristal-3dsam/humancomparisonvarifolds

Authors: Emery Pierson Mohamed Daoudi Sylvain Arguillere

Tag: 3d shape

CV / Point Cloud

2207.10762 MeshLoc: Mesh-Based Visual Localization

Arxiv Link: http://arxiv.org/abs/2207.10762

Github Link: https://github.com/tsattler/meshloc_release

Authors: Vojtech Panek Zuzana Kukelova Torsten Sattler

Tag: pose estimation feature matching autonomous robot point cloud camera pose 3d model

2207.11484 GraphFit: Learning Multi-scale Graph-Convolutional Representation for Point Cloud Normal Estimation

Arxiv Link: http://arxiv.org/abs/2207.11484

Github Link: https://github.com/uestcjay/graphfit

Authors: Keqiang Li Mingyang Zhao Huaiyu Wu Dong-Ming Yan Zhen Shen

Tag: 3d point cloud benchmark dataset attention mechanism point cloud feature representation

2207.11554 Towards Open Set 3D Learning: A Benchmark on Object Point Clouds

Arxiv Link: http://arxiv.org/abs/2207.11554

Github Link: https://github.com/antoalli/3d_os

Authors: Antonio Alliegro Francesco Cappio Borlino Tatiana Tommasi

Tag: autonomous system point cloud safety-critical application 3d model

2207.11753 Label-Guided Auxiliary Training Improves 3D Object Detector

Arxiv Link: http://arxiv.org/abs/2207.11753

Github Link: https://github.com/fabiencode/lg3d

Authors: Yaomin Huang Xinmei Liu Yichen Zhu Zhiyuan Xu Chaomin Shen

Tag: object detection point cloud object detector bounding box

2207.12654 ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection

Arxiv Link: http://arxiv.org/abs/2207.12654

Github Link: https://github.com/yinjunbo/proposalcontrast

Authors: Junbo Yin Dingfu Zhou Liangjun Zhang Jin Fang Cheng-Zhong Xu

Tag: object detection point cloud receptive field

2207.12655 Semi-supervised 3D Object Detection with Proficient Teachers

Arxiv Link: http://arxiv.org/abs/2207.12655

Github Link: https://github.com/yinjunbo/proficientteachers

Authors: Junbo Yin Jin Fang Dingfu Zhou Liangjun Zhang Cheng-Zhong Xu

Tag: teacher model object detection contrastive learning supervised learning point cloud pseudo label autonomous driving object detector

2207.12691 CENet: Toward Concise and Efficient LiDAR Semantic Segmentation for Autonomous Driving

Arxiv Link: http://arxiv.org/abs/2207.12691

Github Link: https://github.com/huixiancheng/cenet

Authors: Hui-Xian Cheng Xian-Feng Han Guo-Qiang Xiao

Tag: segmentation network activation function semantic segmentation point cloud autonomous driving lidar point cloud

2207.14268 MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud

Arxiv Link: http://arxiv.org/abs/2207.14268

Github Link: https://github.com/michaelramamonjisoa/monteboxfinder

Authors: Michaël Ramamonjisoa Sinisa Stekovic Vincent Lepetit

Tag: point cloud optimization algorithm

CV / Pose Estimation

2207.10762 MeshLoc: Mesh-Based Visual Localization

Arxiv Link: http://arxiv.org/abs/2207.10762

Github Link: https://github.com/tsattler/meshloc_release

Authors: Vojtech Panek Zuzana Kukelova Torsten Sattler

Tag: pose estimation feature matching autonomous robot point cloud camera pose 3d model

2207.10955 Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection

Arxiv Link: http://arxiv.org/abs/2207.10955

Github Link: https://github.com/AlvinYH/Faster-VoxelPose

Authors: Hang Ye Wentao Zhu Chunyu Wang Rujie Wu Yizhou Wang

Tag: real-time pose estimation human pose estimation bounding box

2207.12537 Live Stream Temporally Embedded 3D Human Body Pose and Shape Estimation

Arxiv Link: http://arxiv.org/abs/2207.12537

Github Link: https://github.com/ostadabbas/tepose

Authors: Zhouping Wang Sarah Ostadabbas

Tag: motion estimation human pose estimation pose estimation graph convolutional network real-time convolutional network adversarial training human body human behavior

2207.13264 Instance-specific 6-DoF Object Pose Estimation from Minimal Annotation

Arxiv Link: http://arxiv.org/abs/2207.13264

Github Link: https://github.com/rohanpsingh/objectkeypointtrainer

Authors: Rohan Pratap Singh Iori Kumagai Antonio Gabas Mehdi Benallegue Yusuke Yoshiyasu

Tag: deep neural network neural network pose estimation lighting condition camera pose

2207.13784 AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing

Arxiv Link: http://arxiv.org/abs/2207.13784

Github Link: https://github.com/eth-siplab/avatarposer

Authors: Jiaxi Jiang Paul Streli Huajian Qiu Andreas Fender Larissa Laich

Tag: pose estimation real-time transformer encoder learning-based method

RL / Reinforcement Learning

2207.10763 Sim-to-real Deep Reinforcement Learning for Comparing Low-cost High-Resolution Robot Touch

Arxiv Link: http://arxiv.org/abs/2207.10763

Github Link: https://github.com/ac-93/tactile_gym

Authors: Yijiong Lin John Lloyd Alex Church Nathan F. Lepora

Tag: deep reinforcement learning reinforcement learning

2207.11432 Driver Dojo: A Benchmark for Generalizable Reinforcement Learning for Autonomous Driving

Arxiv Link: http://arxiv.org/abs/2207.11432

Github Link: https://github.com/seawee1/driver-dojo

Authors: Sebastian Rietsch Shih-Yuan Huang Georgios Kontes Axel Plinge Christopher Mutschler

Tag: autonomous driving reinforcement learning

2207.11584 Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning

Arxiv Link: http://arxiv.org/abs/2207.11584

Github Link: https://github.com/ucl-dark/skillhack

Authors: Michael Matthews Mikayel Samvelyan Jack Parker-Holder Edward Grefenstette Tim Rocktäschel

Tag: reinforcement learning inductive bias sparse reward reward function rl agent

2207.12644 Learning Bipedal Walking On Planned Footsteps For Humanoid Robot

Arxiv Link: http://arxiv.org/abs/2207.12644

Github Link: https://github.com/rohanpsingh/learninghumanoidwalking

Authors: Rohan Pratap Singh Mehdi Benallegue Mitsuharu Morisawa Rafael Cisneros Fumio Kanehiro

Tag: curriculum learning deep reinforcement learning reinforcement learning simulation environment

2207.13249 AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation

Arxiv Link: http://arxiv.org/abs/2207.13249

Github Link: https://github.com/crazorback/aadg

Authors: Junyan Lyu Yiqi Zhang Yijin Huang Li Lin Pujin Cheng

Tag: source code reinforcement learning convolutional neural network generalization performance neural network segmentation task deep reinforcement learning domain generalization data augmentation image segmentation medical image medical image segmentation adversarial training

2207.13453 Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithm

Arxiv Link: http://arxiv.org/abs/2207.13453

Github Link: https://github.com/baturaysaglam/dase

Authors: Baturay Saglam Dogan C. Cicek Furkan B. Mutlu Suleyman S. Kozat

Tag: deep reinforcement learning reinforcement learning

CV / Transformer

2207.10774 Focused Decoding Enables 3D Anatomical Detection by Transformers

Arxiv Link: http://arxiv.org/abs/2207.10774

Github Link: https://github.com/bwittmann/transoar

Authors: Bastian Wittmann Fernando Navarro Suprosanna Shit Bjoern Menze

Tag: object detection attention mechanism vision transformer transformer encoder encoder-decoder architecture

2207.11860 Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation

Arxiv Link: http://arxiv.org/abs/2207.11860

Github Link: https://github.com/jamycheung/trans4pass

Authors: Jiaming Zhang Kailun Yang Hao Shi Simon Reiß Kunyu Peng

Tag: unsupervised domain adaptation semantic segmentation domain adaptation vision transformer

2207.14284 HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolution

Arxiv Link: http://arxiv.org/abs/2207.14284

Github Link: https://github.com/raoyongming/hornet

Authors: Yongming Rao Wenliang Zhao Yansong Tang Jie Zhou Ser-Nam Lim

Tag: self-attention object detection swin transformer imagenet classification semantic segmentation vision transformer

CV / Medical Image

2207.10804 Suppressing Poisoning Attacks on Federated Learning for Medical Imaging

Arxiv Link: http://arxiv.org/abs/2207.10804

Github Link: https://github.com/naiftt/spafd

Authors: Naif Alkhunaizi Dmitry Kamzolov Martin Takáč Karthik Nandakumar

Tag: federated learning medical imaging privacy concern
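
The paper contributes its own suppression scheme (see the repo); as a point of reference, a common robust-aggregation baseline in this setting swaps FedAvg's mean for a coordinate-wise median. A sketch assuming each client update arrives as one flattened tensor:

```python
import torch

def median_aggregate(client_updates: list[torch.Tensor]) -> torch.Tensor:
    # coordinate-wise median of flattened client updates; a single poisoned
    # client cannot drag the result arbitrarily far, unlike the FedAvg mean
    return torch.stack(client_updates).median(dim=0).values
```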

2207.11553 High-Resolution Swin Transformer for Automatic Medical Image Segmentation

Arxiv Link: http://arxiv.org/abs/2207.11553

Github Link: https://github.com/auroua/hrstnet

Authors: Chen Wei Shenghan Ren Kaitai Guo Haihong Hu Jimin Liang

Tag: u-net feature map transformer network swin transformer image segmentation medical image medical image segmentation

2207.11581 Self-Supervised Learning of Echocardiogram Videos Enables Data-Efficient Clinical Diagnosis

Arxiv Link: http://arxiv.org/abs/2207.11581

Github Link: https://github.com/cards-yale/echo-ssl-aortic-stenosis

Authors: Gregory Holste Evangelos K. Oikonomou Bobak Mortazavi Zhangyang Wang Rohan Khera

Tag: supervised learning deep learning medical image transfer learning representation learning self-supervised learning

2207.11683 PCA: Semi-supervised Segmentation with Patch Confidence Adversarial Training

Arxiv Link: http://arxiv.org/abs/2207.11683

Github Link: https://github.com/HiLab-git/SSL4MIS

Authors: Zihang Xu Zhenghua Xu Shuo Zhang Thomas Lukasiewicz

Tag: segmentation performance supervised learning semantic information deep learning classification result image segmentation medical image medical image segmentation adversarial training

2207.12872 Generalized Probabilistic U-Net for medical image segmentation

Arxiv Link: http://arxiv.org/abs/2207.12872

Github Link: https://github.com/ishaanb92/generalizedprobabilisticunet

Authors: Ishaan Bhat Josien P.W. Pluim Hugo J. Kuijf

Tag: u-net medical image gaussian distribution latent space

2207.13249 AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation

Arxiv Link: http://arxiv.org/abs/2207.13249

Github Link: https://github.com/crazorback/aadg

Authors: Junyan Lyu Yiqi Zhang Yijin Huang Li Lin Pujin Cheng

Tag: source code reinforcement learning convolutional neural network generalization performance neural network segmentation task deep reinforcement learning domain generalization data augmentation image segmentation medical image medical image segmentation adversarial training

2207.13415 TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism for a Deep Segmentation Model

Arxiv Link: http://arxiv.org/abs/2207.13415

Github Link: https://github.com/rezazad68/transnorm

Authors: Reza Azad Mohammad T. AL-Antary Moein Heidari Dorit Merhof

Tag: self-attention convolutional neural network neural network u-net segmentation task attention mechanism feature fusion segmentation model image segmentation medical image medical image segmentation

GNN / Graph Network

2207.10860 Transformer with Implicit Edges for Particle-based Physics Simulation

Arxiv Link: http://arxiv.org/abs/2207.10860

Github Link: https://github.com/ftbabi/tie_eccv2022

Authors: Yidi Shao Chen Change Loy Bo Dai

Tag: self-attention graph neural network attention module neural network

2207.11088 Layer-refined Graph Convolutional Networks for Recommendation

Arxiv Link: http://arxiv.org/abs/2207.11088

Github Link: https://github.com/enoche/imrec

Authors: Xin Zhou Donghui Lin Yong Liu Chunyan Miao

Tag: fixed point graph convolutional network convolutional network topological structure

2207.11247 Panoptic Scene Graph Generation

Arxiv Link: http://arxiv.org/abs/2207.11247

Github Link: https://github.com/Jingkang50/OpenPSG

Authors: Jingkang Yang Yi Zhe Ang Zujin Guo Kaiyang Zhou Wayne Zhang

Tag: bounding box panoptic segmentation graph representation

2207.12537 Live Stream Temporally Embedded 3D Human Body Pose and Shape Estimation

Arxiv Link: http://arxiv.org/abs/2207.12537

Github Link: https://github.com/ostadabbas/tepose

Authors: Zhouping Wang Sarah Ostadabbas

Tag: motion estimation human pose estimation pose estimation graph convolutional network real-time convolutional network adversarial training human body human behavior

2207.11996 Generative Subgraph Contrast for Self-Supervised Graph Representation Learning

Arxiv Link: http://arxiv.org/abs/2207.11996

Github Link: https://github.com/yh-han/gsc

Authors: Yuehui Han Le Hui Haobo Jiang Jianjun Qian Jin Xie

Tag: graph representation contrastive loss node classification benchmark dataset contrastive learning wasserstein distance representation learning learning framework

2207.13262 Factorial User Modeling with Hierarchical Graph Neural Network for Enhanced Sequential Recommendation

Arxiv Link: http://arxiv.org/abs/2207.13262

Github Link: https://github.com/xlx0010/hgnn

Authors: Lyuxin Xue Deqing Yang Yanghua Xiao

Tag: user preference graph neural network neural network

CV / Instance Segmentation

2207.10936 Long-tailed Instance Segmentation using Gumbel Optimized Loss

Arxiv Link: http://arxiv.org/abs/2207.10936

Github Link: https://github.com/kostas1515/gol

Authors: Konstantinos Panagiotis Alexandridis Jiankang Deng Anh Nguyen Shan Luo

Tag: instance segmentation object detection

2207.11103 DeVIS: Making Deformable Transformers Work for Video Instance Segmentation

Arxiv Link: http://arxiv.org/abs/2207.11103

Github Link: https://github.com/acaelles97/devis

Authors: Adrià Caelles Tim Meinhardt Guillem Brasó Laura Leal-Taixé

Tag: video instance segmentation video sequence feature map object detection instance segmentation segmentation task

2207.12824 Compositional Human-Scene Interaction Synthesis with Semantic Control

Arxiv Link: http://arxiv.org/abs/2207.12824

Github Link: https://github.com/zkf1997/coins

Authors: Kaifeng Zhao Shaofei Wang Yan Zhang Thabo Beeler Siyu Tang

Tag: generative model instance segmentation latent space human body

CV / Semantic Segmentation

2207.11102 Physiology-based simulation of the retinal vasculature enables annotation-free segmentation of OCT angiographs

Arxiv Link: http://arxiv.org/abs/2207.11102

Github Link: https://github.com/iMED-Lab/OCTA-Net-OCTA-Vessel-Segmentation-Network

Authors: Martin J. Menten Johannes C. Paetzold Alina Dima Bjoern H. Menze Benjamin Knier

Tag: semantic segmentation learning-based method synthetic data

2207.11549 Self-Support Few-Shot Semantic Segmentation

Arxiv Link: http://arxiv.org/abs/2207.11549

Github Link: https://github.com/fanq15/ssp

Authors: Qi Fan Wenjie Pei Yu-Wing Tai Chi-Keung Tang

Tag: semantic segmentation

2207.11722 Semantic-guided Multi-Mask Image Harmonization

Arxiv Link: http://arxiv.org/abs/2207.11722

Github Link: https://github.com/xuqianren/semantic-guided-multi-mask-image-harmonization

Authors: Xuqian Ren Yifan Liu

Tag: semantic segmentation segmentation mask

2207.12546 The Bearable Lightness of Big Data: Towards Massive Public Datasets in Scientific Machine Learning

Arxiv Link: http://arxiv.org/abs/2207.12546

Github Link: https://github.com/ihmegroup/lossy_ml

Authors: Wai Tong Chung Ki Sung Jung Jacqueline H. Chen Matthias Ihme

Tag: semantic segmentation deep learning

2207.12691 CENet: Toward Concise and Efficient LiDAR Semantic Segmentation for Autonomous Driving

Arxiv Link: http://arxiv.org/abs/2207.12691

Github Link: https://github.com/huixiancheng/cenet

Authors: Hui-Xian Cheng Xian-Feng Han Guo-Qiang Xiao

Tag: segmentation network activation function semantic segmentation point cloud autonomous driving lidar point cloud

2207.13297 GPS-GLASS: Learning Nighttime Semantic Segmentation Using Daytime Video and GPS data

Arxiv Link: http://arxiv.org/abs/2207.13297

Github Link: https://github.com/jimmy9704/gps-glass

Authors: Hongjae Lee Changwoo Han Seung-Won Jung

Tag: segmentation network source code semantic segmentation autonomous driving video frame

2207.13600 Lightweight and Progressively-Scalable Networks for Semantic Segmentation

Arxiv Link: http://arxiv.org/abs/2207.13600

Github Link: https://github.com/yihengzhang-cv/lps-net

Authors: Yiheng Zhang Ting Yao Zhaofan Qiu Tao Mei

Tag: semantic segmentation learning framework

2207.11860 Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation

Arxiv Link: http://arxiv.org/abs/2207.11860

Github Link: https://github.com/jamycheung/trans4pass

Authors: Jiaming Zhang Kailun Yang Hao Shi Simon Reiß Kunyu Peng

Tag: unsupervised domain adaptation semantic segmentation domain adaptation vision transformer

2207.14284 HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolution

Arxiv Link: http://arxiv.org/abs/2207.14284

Github Link: https://github.com/raoyongming/hornet

Authors: Yongming Rao Wenliang Zhao Yansong Tang Jie Zhou Ser-Nam Lim

Tag: self-attention object detection swin transformer imagenet classification semantic segmentation vision transformer

CV / Data Augmentation

2207.08531 ID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection

Arxiv Link: http://arxiv.org/abs/2207.08531

Github Link: https://github.com/spengliang/did-m3d

Authors: Liang Peng Xiaopei Wu Zheng Yang Haifeng Liu Deng Cai

Tag: depth estimation object detection data augmentation ablation study

2207.12535 Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning

Arxiv Link: http://arxiv.org/abs/2207.12535

Github Link: https://github.com/xinleihe/semi-leak

Authors: Xinlei He Hongbin Liu Neil Zhenqiang Gong Yang Zhang

Tag: supervised learning data augmentation data privacy

2207.12757 Controllable User Dialogue Act Augmentation for Dialogue State Tracking

Arxiv Link: http://arxiv.org/abs/2207.12757

Github Link: https://github.com/miulab/cuda-dst

Authors: Chun-Mao Lai Ming-Hao Hsu Chao-Wei Huang Yun-Nung Chen

Tag: data augmentation dialogue state tracking

2207.11984 RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation

Arxiv Link: http://arxiv.org/abs/2207.11984

Github Link: https://github.com/hmhemu/ra-depth

Authors: Mu He Le Hui Yikai Bian Jian Ren Jin Xie

Tag: depth estimation data augmentation data augmentation method

2207.13249 AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation

Arxiv Link: http://arxiv.org/abs/2207.13249

Github Link: https://github.com/crazorback/aadg

Authors: Junyan Lyu Yiqi Zhang Yijin Huang Li Lin Pujin Cheng

Tag: source code reinforcement learning convolutional neural network generalization performance neural network segmentation task deep reinforcement learning domain generalization data augmentation image segmentation medical image medical image segmentation adversarial training

2207.14255 Efficient Training of Language Models to Fill in the Middle

Arxiv Link: http://arxiv.org/abs/2207.14255

Github Link: https://github.com/openai/human-eval-infilling

Authors: Mohammad Bavarian Heewoo Jun Nikolas Tezak John Schulman Christine McLeavey

Tag: data augmentation language model autoregressive language model

NLP / Relation Extraction

2207.11433 Enhancing Document-level Relation Extraction by Entity Knowledge Injection

Arxiv Link: http://arxiv.org/abs/2207.11433

Github Link: https://github.com/nju-websoft/kire

Authors: Xinyi Wang Zitao Wang Weijian Sun Wei Hu

Tag: relation extraction knowledge graph benchmark dataset

NLP / Knowledge Graph

2207.11433 Enhancing Document-level Relation Extraction by Entity Knowledge Injection

Arxiv Link: http://arxiv.org/abs/2207.11433

Github Link: https://github.com/nju-websoft/kire

Authors: Xinyi Wang Zitao Wang Weijian Sun Wei Hu

Tag: relation extraction knowledge graph benchmark dataset

2207.11436 Facing Changes: Continual Entity Alignment for Growing Knowledge Graphs

Arxiv Link: http://arxiv.org/abs/2207.11436

Github Link: https://github.com/nju-websoft/contea

Authors: Yuxin Wang Yuanning Cui Wenqiang Liu Zequn Sun Yiqiao Jiang

Tag: knowledge graph

2207.11442 \(\mu\text{KG}\): A Library for Multi-source Knowledge Graph Embeddings and Applications

Arxiv Link: http://arxiv.org/abs/2207.11442

Github Link: https://github.com/nju-websoft/mukg

Authors: Xindi Luo Zequn Sun Wei Hu

Tag: link prediction downstream task benchmark dataset question answering knowledge graph deep learning representation learning
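
As background on what a knowledge-graph-embedding library packages: the simplest and oldest scorer, TransE, rates a triple (h, r, t) by how close h + r lands to t. A sketch of that scoring function only; μKG's actual API and model zoo are documented in the repo:

```python
import torch

def transe_score(h: torch.Tensor, r: torch.Tensor, t: torch.Tensor, p: int = 1) -> torch.Tensor:
    # TransE plausibility: higher (less negative) means more plausible;
    # training pushes h + r toward t for true triples
    return -torch.norm(h + r - t, p=p, dim=-1)
```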

2207.12888 LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

Arxiv Link: http://arxiv.org/abs/2207.12888

Github Link: https://github.com/hackerchenzhuo/LaKo

Authors: Zhuo Chen Yufeng Huang Jiaoyan Chen Yuxia Geng Yin Fang

Tag: language model question answering knowledge graph text generation

NLP / Question Answering

2207.11442 \(\mu\text{KG}\): A Library for Multi-source Knowledge Graph Embeddings and Applications

Arxiv Link: http://arxiv.org/abs/2207.11442

Github Link: https://github.com/nju-websoft/mukg

Authors: Xindi Luo Zequn Sun Wei Hu

Tag: link prediction downstream task benchmark dataset question answering knowledge graph deep learning representation learning

2207.12576 WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models

Arxiv Link: http://arxiv.org/abs/2207.12576

Github Link: https://github.com/winogavil/winogavil-experiments

Authors: Yonatan Bitton Nitzan Bitton Guetta Ron Yosef Yuval Elovici Mohit Bansal

Tag: question answering language model

2207.12888 LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

Arxiv Link: http://arxiv.org/abs/2207.12888

Github Link: https://github.com/hackerchenzhuo/LaKo

Authors: Zhuo Chen Yufeng Huang Jiaoyan Chen Yuxia Geng Yin Fang

Tag: language model question answering knowledge graph text generation

2207.13332 RealTime QA: What's the Answer Right Now?

Arxiv Link: http://arxiv.org/abs/2207.13332

Github Link: https://github.com/realtimeqa/realtimeqa_public

Authors: Jungo Kasai Keisuke Sakaguchi Yoichi Takahashi Ronan Le Bras Akari Asai

Tag: information retrieval question answering language model real-time pretrained language model qa system

NLP / Transformer

2207.11553 High-Resolution Swin Transformer for Automatic Medical Image Segmentation

Arxiv Link: http://arxiv.org/abs/2207.11553

Github Link: https://github.com/auroua/hrstnet

Authors: Chen Wei Shenghan Ren Kaitai Guo Haihong Hu Jimin Liang

Tag: u-net feature map transformer network swin transformer image segmentation medical image medical image segmentation

2207.11808 ArmanEmo: A Persian Dataset for Text-based Emotion Detection

Arxiv Link: http://arxiv.org/abs/2207.11808

Github Link: https://github.com/arman-rayan-sharif/arman-text-emotion

Authors: Hossein Mirzaee Javad Peymanfard Hamid Habibzadeh Moshtaghin Hossein Zeinali

Tag: transformer-based language model transfer learning language model

2207.12661 Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training

Arxiv Link: http://arxiv.org/abs/2207.12661

Github Link: https://github.com/hxyou/msclip

Authors: Haoxuan You Luowei Zhou Bin Xiao Noel Codella Yu Cheng

Tag: downstream task vision task multiple modality visual representation imagenet classification transformer model

NLP / Pretrained language model

2207.13332 RealTime QA: What's the Answer Right Now?

Arxiv Link: http://arxiv.org/abs/2207.13332

Github Link: https://github.com/realtimeqa/realtimeqa_public

Authors: Jungo Kasai Keisuke Sakaguchi Yoichi Takahashi Ronan Le Bras Akari Asai

Tag: information retrieval question answering language model real-time pretrained language model qa system

NLP / Word Embedding

2207.13842 Dive into Machine Learning Algorithms for Influenza Virus Host Prediction with Hemagglutinin Sequences

Arxiv Link: http://arxiv.org/abs/2207.13842

Github Link: https://github.com/dkdjb/iav_host_prediction

Authors: Yanhua Xu Dominik Wojtczak

Tag: word embedding neural network

Others

2207.10719 Synthetic Dataset Generation for Adversarial Machine Learning Research

Arxiv Link: http://arxiv.org/abs/2207.10719

Github Link: https://github.com/carla-simulator/carla

Authors: Xiruo Liu Shibani Singh Cory Cornelius Colin Busho Mike Tan

Tag: synthetic dataset cyber-physical system adversarial example

2207.10732 Explainable AI Algorithms for Vibration Data-based Fault Detection: Use Case-adapted Methods and Critical Evaluation

Arxiv Link: http://arxiv.org/abs/2207.10732

Github Link: https://github.com/o-mey/xai-vibration-fault-detection

Authors: Oliver Mey Deniz Neufeld

Tag: deep neural network convolutional neural network neural network synthetic data fourier transform explainable ai

2207.10765 Towards Interpretable Video Super-Resolution via Alternating Optimization

Arxiv Link: http://arxiv.org/abs/2207.10765

Github Link: https://github.com/caojiezhang/davsr

Authors: Jiezhang Cao Jingyun Liang Kai Zhang Wenguan Wang Qin Wang

Tag: motion blur video sequence learning-based method

2207.10816 Mathematical Model of Strong Physically Unclonable Functions Based on Hybrid Boolean Networks

Arxiv Link: http://arxiv.org/abs/2207.10816

Github Link: https://github.com/noeloikeau/networkm

Authors: Noeloikeau Charlot Daniel J. Gauthier Daniel Canaday Andrew Pomerance

Tag:

2207.10830 Automated Dilated Spatio-Temporal Synchronous Graph Modeling for Traffic Prediction

Arxiv Link: http://arxiv.org/abs/2207.10830

Github Link: https://github.com/jinguangyin/auto-dstsgn

Authors: Guangyin Jin Fuxian Li Jinlei Zhang Mudan Wang Jincai Huang

Tag: graph convolution source code graph structure representation learning

2207.10852 Spatio-Temporal Deformable Attention Network for Video Deblurring

Arxiv Link: http://arxiv.org/abs/2207.10852

Github Link: https://github.com/huicongzhang/stdan

Authors: Huicong Zhang Haozhe Xie Hongxun Yao

Tag: video frame

2207.10856 Prototype-Guided Continual Adaptation for Class-Incremental Unsupervised Domain Adaptation

Arxiv Link: http://arxiv.org/abs/2207.10856

Github Link: https://github.com/hongbin98/proca

Authors: Hongbin Lin Yifan Zhang Zhen Qiu Shuaicheng Niu Chuang Gan

Tag: unsupervised domain adaptation domain adaptation source code

2207.10869 Optimizing Image Compression via Joint Learning with Denoising

Arxiv Link: http://arxiv.org/abs/2207.10869

Github Link: https://github.com/felixcheng97/denoisecompression

Authors: Ka Leong Cheng Yueqi Xie Qifeng Chen

Tag: source code

2207.10878 An Ensemble Approach for Multiple Emotion Descriptors Estimation Using Multi-task Learning

Arxiv Link: http://arxiv.org/abs/2207.10878

Github Link: https://github.com/tmtvaa/abaw4

Authors: Irfan Haider Minh-Trieu Tran Soo-Hyung Kim Hyung-Jeong Yang Guee-Sang Lee

Tag: attention mechanism

2207.10883 My View is the Best View: Procedure Learning from Egocentric Video

Arxiv Link: http://arxiv.org/abs/2207.10883

Github Link: https://github.com/Sid2697/EgoProceL-egocentric-procedure-learning

Authors: Siddhant Bansal Chetan Arora C.V. Jawahar

Tag: source code

2207.10888 FairGRAPE: Fairness-aware GRAdient Pruning mEthod for Face Attribute Classification

Arxiv Link: http://arxiv.org/abs/2207.10888

Github Link: https://github.com/bernardo1998/fairgrape

Authors: Xiaofeng Lin Seungbae Kim Jungseock Joo

Tag:

2207.10897 Efficient Modeling of Future Context for Image Captioning

Arxiv Link: http://arxiv.org/abs/2207.10897

Github Link: https://github.com/feizc/future-caption

Authors: Zhengcong Fei Junshi Huang Xiaoming Wei Xiaolin Wei

Tag: ms coco source code

2207.10899 Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness

Arxiv Link: http://arxiv.org/abs/2207.10899

Github Link: https://github.com/pantheon5100/deacl

Authors: Chaoning Zhang Kang Zhang Chenshuang Zhang Axi Niu Jiu Feng

Tag: robust representation contrastive learning representation learning supervised learning adversarial training self-supervised learning

2207.10915 Optimization of Forcemyography Sensor Placement for Arm Movement Recognition

Arxiv Link: http://arxiv.org/abs/2207.10915

Github Link: https://github.com/jerryx1110/iros22-fmg-sensor-optimization

Authors: Xiaohao Xu Zihao Du Huaxin Zhang Ruichao Zhang Zihan Hong

Tag: downstream task optimization algorithm graph theory sensor placement

2207.10931 What's in the laundromat? Mapping and characterising offshore owned domestic property in London

Arxiv Link: http://arxiv.org/abs/2207.10931

Github Link: https://github.com/jonnob/empty_homes_london

Authors: Jonathan Bourne Andrea Ingianni Rex McKenzie

Tag:

2207.10941 Respecting Time Series Properties Makes Deep Time Series Forecasting Perfect

Arxiv Link: http://arxiv.org/abs/2207.10941

Github Link: https://github.com/origamisl/rtnet

Authors: Li Shen Yuning Wei Yangzhu Wang

Tag: benchmark dataset source code time series

2207.10942 Efficient Testing of Deep Neural Networks via Decision Boundary Analysis

Arxiv Link: http://arxiv.org/abs/2207.10942

Github Link: https://github.com/anony4paper/aries

Authors: Qiang Hu Yuejun Guo Xiaofei Xie Maxime Cordy Lei Ma

Tag: deep neural network decision boundary neural network deep learning application domain

2207.10947 Multilabel Prototype Generation for Data Reduction in k-Nearest Neighbour classification

Arxiv Link: http://arxiv.org/abs/2207.10947

Github Link: https://github.com/jose-jvmas/multilabel_pg

Authors: Jose J. Valero-Mas Antonio Javier Gallego Pablo Alonso-Jiménez Xavier Serra

Tag:

2207.10950 Scale dependant layer for self-supervised nuclei encoding

Arxiv Link: http://arxiv.org/abs/2207.10950

Github Link: https://github.com/peterjacknaylor/scaledependantcnn

Authors: Peter Naylor Yao-Hung Hubert Tsai Marick Laé Makoto Yamada

Tag: supervised learning downstream task unsupervised method self-supervised learning

2207.11001 POP: Mining POtential Performance of new fashion products via webly cross-modal query expansion

Arxiv Link: http://arxiv.org/abs/2207.11001

Github Link: https://github.com/humaticslab/pop-mining-potential-performance

Authors: Christian Joppi Geri Skenderi Marco Cristani

Tag: time series

2207.11012 Fact sheet: Automatic Self-Reported Personality Recognition Track

Arxiv Link: http://arxiv.org/abs/2207.11012

Github Link: https://github.com/gizemsogancioglu/fgm_utrecht

Authors: Francisca Pessanha Gizem Sogancioglu

Tag: personality trait

2207.11018 Learning from what we know: How to perform vulnerability prediction using noisy historical data

Arxiv Link: http://arxiv.org/abs/2207.11018

Github Link: https://github.com/garghub/trovon

Authors: Aayush Garg Renzo Degiovanni Matthieu Jimenez Maxime Cordy Mike Papadakis

Tag:

2207.11025 Custom Structure Preservation in Face Aging

Arxiv Link: http://arxiv.org/abs/2207.11025

Github Link: https://github.com/guillermogotre/cusp

Authors: Guillermo Gomez-Trenado Stéphane Lathuilière Pablo Mesejo Óscar Cordón

Tag: user study pretrained model

2207.11056 Energy-Aware Planning-Scheduling for Autonomous Aerial Robots

Arxiv Link: http://arxiv.org/abs/2207.11056

Github Link: https://github.com/adamseew/energy-planning-paper

Authors: Adam Seewald Héctor García de Marina Henrik Skov Midtiby Ulrik Pagh Schultz

Tag:

2207.11075 RealFlow: EM-based Realistic Optical Flow Dataset Generation from Videos

Arxiv Link: http://arxiv.org/abs/2207.11075

Github Link: https://github.com/megvii-research/realflow

Authors: Yunhui Han Kunming Luo Ao Luo Jiangyu Liu Haoqiang Fan

Tag: synthetic dataset video frame

2207.11118 Rethinking the Reference-based Distinctive Image Captioning

Arxiv Link: http://arxiv.org/abs/2207.11118

Github Link: https://github.com/maoyj1998/transdic

Authors: Yangjun Mao Long Chen Zhihong Jiang Dong Zhang Zhimeng Zhang

Tag: reference image

2207.11120 On Controller Tuning with Time-Varying Bayesian Optimization

Arxiv Link: http://arxiv.org/abs/2207.11120

Github Link: https://github.com/brunzema/uitvbo

Authors: Paul Brunzema Alexander von Rohr Sebastian Trimpe

Tag: optimal control bayesian optimization gaussian processes numerical experiment spatial dimension
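
For orientation, controller tuning with Bayesian optimization wraps a Gaussian-process surrogate and an acquisition rule such as UCB around closed-loop experiments; the paper's contribution is the time-varying treatment, which this plain (time-invariant) numpy sketch deliberately omits. All names here are illustrative, not the repo's API:

```python
import numpy as np

def rbf(A, B, length_scale=0.2):
    # squared-exponential kernel between the rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_ucb_step(X, y, candidates, beta=2.0, noise=1e-4):
    # X: (n, d) controller gains tried so far; y: (n,) observed performance;
    # candidates: (m, d) gains to choose from. Returns the UCB maximizer.
    K_inv = np.linalg.inv(rbf(X, X) + noise * np.eye(len(X)))
    Ks = rbf(X, candidates)
    mu = Ks.T @ K_inv @ y
    var = 1.0 - np.einsum("ij,ik,kj->j", Ks, K_inv, Ks)
    return candidates[np.argmax(mu + beta * np.sqrt(np.maximum(var, 0.0)))]
```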

2207.11163 Adaptive Soft Contrastive Learning

Arxiv Link: http://arxiv.org/abs/2207.11163

Github Link: https://github.com/mrchenfeng/ascl_icpr2022

Authors: Chen Feng Ioannis Patras

Tag: contrastive learning supervised learning representation learning self-supervised learning learning framework
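
The "soft" part refers to relaxing InfoNCE's hard one-hot instance labels into a distribution over the batch. A minimal sketch of a softened InfoNCE; how ASCL adaptively constructs soft_targets from neighborhood similarity is the paper's contribution and lives in the repo:

```python
import torch
import torch.nn.functional as F

def soft_infonce(z_a, z_b, soft_targets, temperature=0.1):
    # cross-entropy between soft targets over the batch and the softmax of
    # pairwise similarities; soft_targets = torch.eye(B) recovers InfoNCE
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.T / temperature            # (B, B) similarity matrix
    return -(soft_targets * F.log_softmax(logits, dim=1)).sum(1).mean()
```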

2207.11173 Verifying Fairness in Quantum Machine Learning

Arxiv Link: http://arxiv.org/abs/2207.11173

Github Link: https://github.com/veri-q/fairness

Authors: Ji Guan Wang Fang Mingsheng Ying

Tag: quantum computing decision making

2207.11192 Progressive Deblurring of Diffusion Models for Coarse-to-Fine Image Synthesis

Arxiv Link: http://arxiv.org/abs/2207.11192

Github Link: https://github.com/sangyun884/blur-diffusion

Authors: Sangyun Lee Hyungjin Chung Jaehyeon Kim Jong Chul Ye

Tag: inductive bias diffusion model

2207.11221 Domain Generalization for Activity Recognition via Adaptive Feature Fusion

Arxiv Link: http://arxiv.org/abs/2207.11221

Github Link: https://github.com/jindongwang/transferlearning

Authors: Xin Qin Jindong Wang Yiqiang Chen Wang Lu Xinlong Jiang

Tag: domain adaptation domain generalization feature fusion generalization performance

2207.11231 Learning Unsupervised Hierarchies of Audio Concepts

Arxiv Link: http://arxiv.org/abs/2207.11231

Github Link: https://github.com/deezer/concept_hierarchy

Authors: Darius Afchar Romain Hennequin Vincent Guigue

Tag: computer vision

2207.11236 Twitmo: A Twitter Data Topic Modeling and Visualization Package for R

Arxiv Link: http://arxiv.org/abs/2207.11236

Github Link: https://github.com/abuchmueller/Twitmo

Authors: Andreas Buchmüller Gillian Kant Christoph Weisser Benjamin Säfken Krisztina Kis-Katos

Tag: latent dirichlet allocation

2207.11237 Defending Substitution-Based Profile Pollution Attacks on Sequential Recommenders

Arxiv Link: http://arxiv.org/abs/2207.11237

Github Link: https://github.com/yueeeeeeee/recsys-substitution-defense

Authors: Zhenrui Yue Huimin Zeng Ziyi Kou Lanyu Shang Dong Wang

Tag: recommender system probability distribution adversarial attack convex hull adversarial example adversarial training

2207.11243 Multiface: A Dataset for Neural Face Rendering

Arxiv Link: http://arxiv.org/abs/2207.11243

Github Link: https://github.com/facebookresearch/multiface

Authors: Cheng-hsin Wuu Ningyuan Zheng Scott Ardisson Rohan Bali Danielle Belko

Tag: facial expression human face ablation study

2207.11244 Deep Learning Hyperparameter Optimization for Breast Mass Detection in Mammograms

Arxiv Link: http://arxiv.org/abs/2207.11244

Github Link: https://github.com/aralab-unr/ga-mammograms

Authors: Adarsh Sehgal Muskan Sehgal Hung Manh La George Bebis

Tag: genetic algorithm breast cancer dl model deep learning

2207.11321 A flexible PageRank-based graph embedding framework closely related to spectral eigenvector embeddings

Arxiv Link: http://arxiv.org/abs/2207.11321

Github Link: https://github.com/dishashur/log-pagerank

Authors: Disha Shur Yufan Huang David F. Gleich

Tag:

2207.11327 Learning from Multiple Annotator Noisy Labels via Sample-wise Label Fusion

Arxiv Link: http://arxiv.org/abs/2207.11327

Github Link: https://github.com/zhengqigao/learning-from-multiple-annotator-noisy-labels

Authors: Zhengqi Gao Fan-Keng Sun Mingran Yang Sucheng Ren Zikai Xiong

Tag: supervised learning deep learning

2207.11335 Generalizing Homophily to Simplicial Complexes

Arxiv Link: http://arxiv.org/abs/2207.11335

Github Link: https://github.com/arnabsarker/simplicialhomophily

Authors: Arnab Sarker Natalie Northrup Ali Jadbabaie

Tag:

2207.11372 Evaluation of Different Annotation Strategies for Deployment of Parking Spaces Classification System

Arxiv Link: http://arxiv.org/abs/2207.11372

Github Link: https://github.com/andrehochuli/pklot-eval-annotations

Authors: Andre G. Hochuli Alceu S. Britto Jr. Paulo R. L. de Almeida Williams B. S. Alves Fabio M. C. Cagni

Tag: bounding box

2207.11463 When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition

Arxiv Link: http://arxiv.org/abs/2207.11463

Github Link: https://github.com/lbh1024/can

Authors: Bohan Li Ye Yuan Dingkang Liang Xiao Liu Zhilong Ji

Tag: benchmark dataset source code attention mechanism

2207.11464 Learning Object Placement via Dual-path Graph Completion

Arxiv Link: http://arxiv.org/abs/2207.11464

Github Link: https://github.com/bcmi/graconet-object-placement

Authors: Siyuan Zhou Liu Liu Li Niu Liqing Zhang

Tag: receptive field

2207.11482 Multimodal Emotion Recognition with Modality-Pairwise Unsupervised Contrastive Loss

Arxiv Link: http://arxiv.org/abs/2207.11482

Github Link: https://github.com/ricfrr/mpuc-mer

Authors: Riccardo Franceschini Enrico Fini Cigdem Beyan Alessandro Conti Federica Arrigoni

Tag: data fusion contrastive loss benchmark dataset emotion recognition supervised learning unsupervised learning

2207.11486 Time Series Prediction under Distribution Shift using Differentiable Forgetting

Arxiv Link: http://arxiv.org/abs/2207.11486

Github Link: https://github.com/jase-clarkson/pods_2022_icml_ts

Authors: Stefanos Bennett Jase Clarkson

Tag: time series distribution shift predictive model

2207.11517 Contrastive Monotonic Pixel-Level Modulation

Arxiv Link: http://arxiv.org/abs/2207.11517

Github Link: https://github.com/lukun199/monopix

Authors: Kun Lu Rongpeng Li Honggang Zhang

Tag: domain adaptation

2207.11518 Online Knowledge Distillation via Mutual Contrastive Learning for Visual Recognition

Arxiv Link: http://arxiv.org/abs/2207.11518

Github Link: https://github.com/winycg/mcl

Authors: Chuanguang Yang Zhulin An Helong Zhou Yongjun Xu Qian Zhang

Tag: mutual information student model contrastive learning intermediate layer representation learning image classification knowledge distillation transfer learning feature representation
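
For background, the classic logit-distillation loss that online methods like MCL extend: KL divergence between temperature-softened teacher and student predictions. MCL's mutual contrastive terms between peer networks go beyond this sketch (see the repo):

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    # Hinton-style distillation: KL(teacher || student) on softened logits,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T
```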

2207.11521 Vaccine Discourse on Twitter During the COVID-19 Pandemic

Arxiv Link: http://arxiv.org/abs/2207.11521

Github Link: https://github.com/gabriellindelof/vaccine-discourse-on-twitter-during-the-covid-19-pandemic

Authors: Gabriel Lindelöf Talayeh Aledavood Barbara Keller

Tag: support vector machine

2207.11523 Unstructured Road Segmentation using Hypercolumn based Random Forests of Local experts

Arxiv Link: http://arxiv.org/abs/2207.11523

Github Link: https://github.com/prassanna-ravishankar/slither

Authors: Prassanna Ganesh Ravishankar Antonio M. Lopez Gemma M. Sanchez

Tag: random forest convolutional neural network neural network

2207.11530 Kellect: a Kernel-Based Efficient and Lossless Event Log Collector

Arxiv Link: http://arxiv.org/abs/2207.11530

Github Link: https://github.com/acising/kellect

Authors: Tieming Chen Qijie Song Xuebo Qiu Tiantian Zhu Zhiling Zhu

Tag: event log anomaly detection

2207.11575 Testing the Robustness of Learned Index Structures

Arxiv Link: http://arxiv.org/abs/2207.11575

Github Link: https://github.com/bachfischer/logarithmicerrorregression

Authors: Matthias Bachfischer Renata Borovica-Gajic Benjamin I. P. Rubinstein

Tag: linear regression
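
A learned index replaces tree traversal with a model that predicts a key's position and then searches only within the model's worst-case error bound; the paper stress-tests the robustness of exactly this construction. A self-contained sketch with a single linear model (production designs use hierarchies of models):

```python
import bisect

class LearnedIndex:
    # minimal learned-index idea: a linear model predicts a key's position,
    # and a bounded local search corrects the prediction
    def __init__(self, keys):                     # keys: sorted list of numbers
        self.keys = keys
        n = len(keys)
        mean_k = sum(keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
        self.a = cov / var                        # least-squares slope ...
        self.b = mean_p - self.a * mean_k         # ... and intercept
        self.err = max(abs(i - self._predict(k)) for i, k in enumerate(keys))

    def _predict(self, key):
        return round(self.a * key + self.b)

    def lookup(self, key):
        guess = self._predict(key)
        lo = min(max(0, guess - self.err), len(self.keys))
        hi = min(len(self.keys), guess + self.err + 1)  # worst-case error window
        i = bisect.bisect_left(self.keys, key, lo, max(lo, hi))
        return i if i < len(self.keys) and self.keys[i] == key else None
```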

2207.11592 Thermal half-lives of azobenzene derivatives: virtual screening based on intersystem crossing using a machine learning potential

Arxiv Link: http://arxiv.org/abs/2207.11592

Github Link: https://github.com/learningmatter-mit/azo_barriers

Authors: Simon Axelrod Eugene Shakhnovich Rafael Gomez-Bombarelli

Tag:

2207.11603 Experience with Abrupt Transition to Remote Teaching of Embedded Systems

Arxiv Link: http://arxiv.org/abs/2207.11603

Github Link: https://github.com/koniarik/teaching-embedded-remotely

Authors: Jan Koniarik Daniel Dlhopolcek Martin Ukrop

Tag:

2207.11609 Exploring the Impact of Temporal Bias in Point-of-Interest Recommendation

Arxiv Link: http://arxiv.org/abs/2207.11609

Github Link: https://github.com/rahmanidashti/contextsfair

Authors: Hossein A. Rahmani Mohammadmehdi Naghiaei Ali Tourani Yashar Deldjoo

Tag: contextual information recommender system

2207.11637 Explored An Effective Methodology for Fine-Grained Snake Recognition

Arxiv Link: http://arxiv.org/abs/2207.11637

Github Link: https://github.com/aderonhuang/fgvc9_snakeclef2022

Authors: Yong Huang Aderon Huang Wei Zhu Yanming Fang Jinghua Feng

Tag: supervised learning downstream task computer vision self-supervised learning

2207.11640 Reliable amortized variational inference with physics-based latent distribution correction

Arxiv Link: http://arxiv.org/abs/2207.11640

Github Link: https://github.com/slimgroup/reliableavi.jl

Authors: Ali Siahkoohi Gabrio Rizzuti Rafael Orozco Felix J. Herrmann

Tag: deep neural network posterior distribution neural network variational inference bayesian inference gaussian distribution prior distribution distribution shift kullback-leibler divergence

2207.11652 Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis

Arxiv Link: http://arxiv.org/abs/2207.11652

Github Link: https://github.com/teng-sun/clue_model

Authors: Teng Sun Wenjie Wang Liqiang Jing Yiran Cui Xuemeng Song

Tag: sentiment analysis causal inference

2207.11677 Learnable Privacy-Preserving Anonymization for Pedestrian Images

Arxiv Link: http://arxiv.org/abs/2207.11677

Github Link: https://github.com/whuzjw/privacy-reid

Authors: Junwu Zhang Mang Ye Yao Yang

Tag: semantic information privacy protection

2207.11680 No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code Intelligence

Arxiv Link: http://arxiv.org/abs/2207.11680

Github Link: https://github.com/adf1178/pt4code

Authors: Chaozheng Wang Yuanhang Yang Cuiyun Gao Yun Peng Hongyu Zhang

Tag: downstream task nlp task natural language processing low-resource

2207.11685 Kernel Relative-prototype Spectral Filtering for Few-shot Learning

Arxiv Link: http://arxiv.org/abs/2207.11685

Github Link: https://github.com/zhangtao2022/dsfn

Authors: Tao Zhang Wu Huang

Tag: few-shot learning source code imagenet dataset hilbert space

2207.11717 A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues

Arxiv Link: http://arxiv.org/abs/2207.11717

Github Link: https://github.com/jasonarmitage-res/pm-vln

Authors: Jason Armitage Leonardo Impett Rico Sennrich

Tag:

2207.11747 Self-dual polyhedral cones and their slack matrices

Arxiv Link: http://arxiv.org/abs/2207.11747

Github Link: https://github.com/bflourenco/self_dual_slacks

Authors: João Gouveia Bruno F. Lourenço

Tag:

2207.11759 Spatial-Temporal Federated Learning for Lifelong Person Re-identification on Distributed Edges

Arxiv Link: http://arxiv.org/abs/2207.11759

Github Link: https://github.com/MSNLAB/Federated-Lifelong-Person-ReID

Authors: Lei Zhang Guanyu Gao Huaizheng Zhang

Tag: communication cost federated learning lifelong learning

2207.11767 Snapshot Metrics Are Not Enough: Analyzing Software Repositories with Longitudinal Metrics

Arxiv Link: http://arxiv.org/abs/2207.11767

Github Link: https://github.com/softwaresystemslaboratory/prime

Authors: Nicholas Synovic Matt Hyatt Rohan Sethi Sohini Thota Shilpika

Tag: source code

2207.11769 CODiT: Conformal Out-of-Distribution Detection in Time-Series Data

Arxiv Link: http://arxiv.org/abs/2207.11769

Github Link: https://github.com/kaustubhsridhar/time-series-ood

Authors: Ramneet Kaur Kaustubh Sridhar Sangdon Park Susmit Jha Anirban Roy

Tag: computer vision training distribution safety-critical application time-series data autonomous driving anomaly detection ood detection autonomous vehicle

2207.11805 Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions

Arxiv Link: http://arxiv.org/abs/2207.11805

Github Link: https://github.com/lizhi1104/haan

Authors: Zhi Li Lu He Huijuan Xu

Tag: human behavior visual representation

2207.11814 Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism

Arxiv Link: http://arxiv.org/abs/2207.11814

Github Link: https://github.com/md-mohaiminul/objectstatechange

Authors: Md Mohaiminul Islam Gedas Bertasius

Tag: ablation study attention mechanism negative example

2207.12405 Versatile Weight Attack via Flipping Limited Bits

Arxiv Link: http://arxiv.org/abs/2207.12405

Github Link: https://github.com/jiawangbai/versatile-weight-attack

Authors: Jiawang Bai Baoyuan Wu Zhifeng Li Shu-tao Xia

Tag: deep neural network continuous optimization adversarial attack neural network

2207.12409 Automated discovery of interpretable gravitational-wave population models

Arxiv Link: http://arxiv.org/abs/2207.12409

Github Link: https://github.com/kazewong/symbolicgwpopulation_paper

Authors: Kaze W. K. Wong Miles Cranmer

Tag: gaussian mixture model

2207.12496 NeuriCam: Video Super-Resolution and Colorization Using Key Frames

Arxiv Link: http://arxiv.org/abs/2207.12496

Github Link: https://github.com/vb000/neuricam

Authors: Bandhav Veluri Ali Saffari Collin Pernu Joshua Smith Michael Taylor

Tag: source code neural network feature map real-time energy consumption

2207.12534 Trainability Preserving Neural Structured Pruning

Arxiv Link: http://arxiv.org/abs/2207.12534

Github Link: https://github.com/mingsun-tse/tpp

Authors: Huan Wang Yun Fu

Tag: deep neural network deep network neural network

2207.12538 Bayesian tensor factorization for predicting clinical outcomes using integrated human genetics evidence

Arxiv Link: http://arxiv.org/abs/2207.12538

Github Link: https://github.com/cx0/icml-human-genetics

Authors: Onuralp Soylemez

Tag: success rate

2207.12570 Seeing Far in the Dark with Patterned Flash

Arxiv Link: http://arxiv.org/abs/2207.12570

Github Link: https://github.com/zhsun0357/seeing-far-in-the-dark-with-patterned-flash

Authors: Zhanghao Sun Jian Wang Yicheng Wu Shree Nayar

Tag: depth estimation image reconstruction convolutional neural network neural network

2207.12577 Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

Arxiv Link: http://arxiv.org/abs/2207.12577

Github Link: https://github.com/wuyushuwys/compiler-aware-nas-sr

Authors: Yushu Wu Yifan Gong Pu Zhao Yanyu Li Zheng Zhan

Tag: neural architecture search power consumption application scenario real-time image quality deep learning neural architecture

2207.12600 Learning Protein Representations via Complete 3D Graph Networks

Arxiv Link: http://arxiv.org/abs/2207.12600

Github Link: https://github.com/divelab/DIG

Authors: Limei Wang Haoran Liu Yi Liu Jerry Kurtin Shuiwang Ji

Tag: downstream task representation learning

2207.12620 Large-displacement 3D Object Tracking with Hybrid Non-local Optimization

Arxiv Link: http://arxiv.org/abs/2207.12620

Github Link: https://github.com/cvbubbles/nonlocal-3dtracking

Authors: Xuhui Tian Xinran Lin Fan Zhong Xueying Qin

Tag: real-time source code

2207.12628 Bundle MCR: Towards Conversational Bundle Recommendation

Arxiv Link: http://arxiv.org/abs/2207.12628

Github Link: https://github.com/aaronheee/bundle-mcr

Authors: Zhankui He Handong Zhao Tong Yu Sungchul Kim Fan Du

Tag: user preference markov decision process user interaction recommender system

2207.12646 Learning Hierarchy Aware Features for Reducing Mistake Severity

Arxiv Link: http://arxiv.org/abs/2207.12646

Github Link: https://github.com/07agarg/haf

Authors: Ashima Garg Depanshu Sani Saket Anand

Tag: source code

2207.12660 Bilateral Self-unbiased Learning from Biased Implicit Feedback

Arxiv Link: http://arxiv.org/abs/2207.12660

Github Link: https://github.com/jaewoong-lee/sigir_2022_biser

Authors: Jae-woong Lee Seongmin Park Joonseok Lee Jongwuk Lee

Tag: recommender system

2207.12750 SPAIC: A Spike-based Artificial Intelligence Computing Framework

Arxiv Link: http://arxiv.org/abs/2207.12750

Github Link: https://github.com/ZhejianglabNCRC/SPAIC

Authors: Chaofei Hong Mengwen Yuan Mengxiao Zhang Xiao Wang Chengjun Zhang

Tag: artificial intelligence deep learning neural network

2207.12762 Productivity meets Performance: Julia on A64FX

Arxiv Link: http://arxiv.org/abs/2207.12762

Github Link: https://github.com/giordano/julia-on-fugaku

Authors: Mosè Giordano Milan Klöwer Valentin Churavy

Tag: programming language

2207.12767 Criteria Comparative Learning for Real-scene Image Super-Resolution

Arxiv Link: http://arxiv.org/abs/2207.12767

Github Link: https://github.com/house-leo/realsr-zero

Authors: Yukai Shi Hao Li Sen Zhang Zhijing Yang Xiao Wang

Tag: low-resolution image computer vision image patch vision task contrastive loss contrastive learning

2207.12819 S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning

Arxiv Link: http://arxiv.org/abs/2207.12819

Github Link: https://github.com/google-research/l2p

Authors: Yabin Wang Zhiwu Huang Xiaopeng Hong

Tag: deep neural network continual learning neural network

2207.12831 Lifelong DP: Consistently Bounded Differential Privacy in Lifelong Machine Learning

Arxiv Link: http://arxiv.org/abs/2207.12831

Github Link: https://github.com/haiphanNJIT/PrivateDeepLearning

Authors: Phung Lai Han Hu NhatHai Phan Ruoming Jin My T. Thai

Tag: differential privacy

2207.12850 Towards Smart City Security: Violence and Weaponized Violence Detection using DCNN

Arxiv Link: http://arxiv.org/abs/2207.12850

Github Link: https://github.com/Ti-Oluwanimi/Violence_Detection

Authors: Toluwani Aremu Li Zhiyuan Reem Alameeri Moayad Aloqaily Mohsen Guizani

Tag: real-time convolutional neural network video frame neural network

2207.12859 Visually explaining 3D-CNN predictions for video classification with an adaptive occlusion sensitivity analysis

Arxiv Link: http://arxiv.org/abs/2207.12859

Github Link: https://github.com/uchiyama33/aosa

Authors: Tomoki Uchiyama Naoya Sogi Koichiro Niinuma Kazuhiro Fukui

Tag: convolutional neural network neural network decision-making process sensitivity analysis image classification video data

2207.12876 Repeated Environment Inference for Invariant Learning

Arxiv Link: http://arxiv.org/abs/2207.12876

Github Link: https://github.com/aamixsh/reiil

Authors: Aayush Mishra Anqi Liu

Tag:

2207.12877 Representing Random Utility Choice Models with Neural Networks

Arxiv Link: http://arxiv.org/abs/2207.12877

Github Link: https://github.com/antoinedesir/rumnet

Authors: Ali Aouad Antoine Désir

Tag: utility function deep learning neural network

2207.12883 Enhancing Collaborative Filtering Recommender with Prompt-Based Sentiment Analysis

Arxiv Link: http://arxiv.org/abs/2207.12883

Github Link: https://github.com/hhhhzy/nlu_project

Authors: Elliot Dang Zheyuan Hu Tong Li

Tag: sentiment analysis nlp model prompt-based learning

2207.12899 Development of a cost-effective headphone calibration procedure for soundscape evaluation

Arxiv Link: http://arxiv.org/abs/2207.12899

Github Link: https://github.com/ntudsp/ica22-calibration

Authors: Bhan Lam Kenneth Ooi Zhen-Ting Ong Karn N. Watcharasupat Trevor Wong

Tag:

2207.12901 Cross-Modality Image Registration using a Training-Time Privileged Third Modality

Arxiv Link: http://arxiv.org/abs/2207.12901

Github Link: https://github.com/qianyeyang/mpmrireg

Authors: Qianye Yang David Atkinson Yunguan Fu Tom Syer Wen Yan

Tag: learning-based method

2207.12906 Searching on the boundary of abundance for odd weird numbers

Arxiv Link: http://arxiv.org/abs/2207.12906

Github Link: https://github.com/fwjmath/ows-data

Authors: Wenjie Fang

Tag:

2207.12934 A Reliable Online Method for Joint Estimation of Focal Length and Camera Rotation

Arxiv Link: http://arxiv.org/abs/2207.12934

Github Link: https://github.com/elderlab-york-university/onlinefr

Authors: Yiming Qian James H. Elder

Tag: camera pose

2207.12980 Efficient One Pass Self-distillation with Zipf's Label Smoothing

Arxiv Link: http://arxiv.org/abs/2207.12980

Github Link: https://github.com/megvii-research/zipfls

Authors: Jiajun Liang Linze Li Zhaodong Bing Borui Zhao Yao Tang

Tag:

2207.12994 V\(^2\)L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval

Arxiv Link: http://arxiv.org/abs/2207.12994

Github Link: https://github.com/wangwenhao0716/v2l

Authors: Wenhao Wang Yifan Sun Zongxin Yang Yi Yang

Tag: language model vision model

2207.13005 Hansel: A Chinese Few-Shot and Zero-Shot Entity Linking Benchmark

Arxiv Link: http://arxiv.org/abs/2207.13005

Github Link: https://github.com/imryanxu/hansel

Authors: Zhenran Xu Zifei Shan Yuxin Li Baotian Hu Bing Qin

Tag: knowledge base

2207.13038 Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Model

Arxiv Link: http://arxiv.org/abs/2207.13038

Github Link: https://github.com/compvis/latent-diffusion

Authors: Robin Rombach Andreas Blattmann Björn Ommer

Tag: source code synthesized image diffusion model

2207.13048 Domain Adaptation under Open Set Label Shift

Arxiv Link: http://arxiv.org/abs/2207.13048

Github Link: https://github.com/acmi-lab/open-set-label-shift

Authors: Saurabh Garg Sivaraman Balakrishnan Zachary C. Lipton

Tag: domain adaptation linear model

2207.13061 NewsStories: Illustrating articles with visual summaries

Arxiv Link: http://arxiv.org/abs/2207.13061

Github Link: https://github.com/newsstoriesdata/newsstories.github.io

Authors: Reuben Tan Bryan A. Plummer Kate Saenko JP Lewis Avneesh Sud

Tag: news article

2207.12194 Domain Decorrelation with Potential Energy Ranking

Arxiv Link: http://arxiv.org/abs/2207.12194

Github Link: https://github.com/foreverps/poer

Authors: Sen Pei Jiaxi Sun Shiming Xiang Gaofeng Meng

Tag: computer vision vision task neural network domain shift domain generalization deep learning

2207.12377 Confident Deep Learning loss function for one-step Conformal Prediction approximation

Arxiv Link: http://arxiv.org/abs/2207.12377

Github Link: https://github.com/juliameister/dl-confident-loss-function

Authors: Julia A. Meister Khuong An Nguyen Stelios Kapetanakis Zhiyuan Luo

Tag: large-scale dataset mnist dataset deep learning

2207.13091 VDL-Surrogate: A View-Dependent Latent-based Model for Parameter Space Exploration of Ensemble Simulation

Arxiv Link: http://arxiv.org/abs/2207.13091

Github Link: https://github.com/trainsn/vdl-surrogate

Authors: Neng Shi Jiayi Xu Hanqi Guo Jonathan Woodring Han-Wei Shen

Tag: source code latent representation surrogate model

2207.13129 LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity

Arxiv Link: http://arxiv.org/abs/2207.13129

Github Link: https://github.com/framartin/lgv-geometric-transferability

Authors: Martin Gubri Maxime Cordy Mike Papadakis Yves Le Traon Koushik Sen

Tag: adversarial attack adversarial example surrogate model

2207.13159 TINYCD: A (Not So) Deep Learning Model For Change Detection

Arxiv Link: http://arxiv.org/abs/2207.13159

Github Link: https://github.com/andreacodegoni/tiny_model_4_cd

Authors: Andrea Codegoni Gabriele Lombardi Alessandro Ferrari

Tag: source code lighting condition real-time deep learning time domain spatial feature

2207.13176 Exploring the Unprecedented Privacy Risks of the Metaverse

Arxiv Link: http://arxiv.org/abs/2207.13176

Github Link: https://github.com/metaguard/metadata

Authors: Vivek Nair Gonzalo Munilla Garrido Dawn Song

Tag:

2207.13179 Unsupervised Learning under Latent Label Shift

Arxiv Link: http://arxiv.org/abs/2207.13179

Github Link: https://github.com/manleyroberts/lls-ddfa

Authors: Manley Roberts Pranav Mani Saurabh Garg Zachary C. Lipton

Tag: matrix factorization unsupervised learning distribution shift

2207.13227 Atomic structure generation from reconstructing structural fingerprints

Arxiv Link: http://arxiv.org/abs/2207.13227

Github Link: https://github.com/fung-lab/structrepgen

Authors: Victor Fung Shuyi Jia Jiaxin Zhang Sirui Bi Junqi Yin

Tag: generative model variational autoencoder

2207.13235 Mid-level Representation Enhancement and Graph Embedded Uncertainty Suppressing for Facial Expression Recognition

Arxiv Link: http://arxiv.org/abs/2207.13235

Github Link: https://github.com/cruiseyugh/gus

Authors: Jie Lei Zhao Liu Zeyu Zou Tong Li Xu Juan

Tag: facial expression representation learning

2207.13247 Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation

Arxiv Link: http://arxiv.org/abs/2207.13247

Github Link: https://github.com/val-iisc/stickerda

Authors: Jogendra Nath Kundu Suvaansh Bhambri Akshay Kulkarni Hiran Sarkar Varun Jampani

Tag: domain shift domain adaptation pretext task unsupervised domain adaptation

2207.13254 Contextual Information and Commonsense Based Prompt for Emotion Recognition in Conversation

Arxiv Link: http://arxiv.org/abs/2207.13254

Github Link: https://github.com/deqingyang/cisper

Authors: Jingjie Yi Deqing Yang Siyu Yuan Caiyan Cao Zhiyao Zhang

Tag: emotion recognition language model contextual information

2207.13307 Marker and source-marker reprogramming of Most Permissive Boolean networks and ensembles with BoNesis

Arxiv Link: http://arxiv.org/abs/2207.13307

Github Link: https://github.com/bnediction/reprogramming-with-bonesis

Authors: Loïc Paulevé

Tag: fixed point dynamical system

2207.13320 Generator Knows What Discriminator Should Learn in Unconditional GANs

Arxiv Link: http://arxiv.org/abs/2207.13320

Github Link: https://github.com/naver-ai/ggdr

Authors: Gayoung Lee Hyunsu Kim Junho Kim Seonghyeon Kim Jung-Woo Ha

Tag: u-net feature map semantic representation

2207.13325 SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding

Arxiv Link: http://arxiv.org/abs/2207.13325

Github Link: https://github.com/qumengxue/siri-vg

Authors: Mengxue Qu Yu Wu Wu Liu Qiqi Gong Xiaodan Liang

Tag:

2207.13353 One-Trimap Video Matting

Arxiv Link: http://arxiv.org/abs/2207.13353

Github Link: https://github.com/hongje/otvm

Authors: Hongje Seong Seoung Wug Oh Brian Price Euntai Kim Joon-Young Lee

Tag: source code information flow

2207.13374 Efficient Video Deblurring Guided by Motion Magnitude

Arxiv Link: http://arxiv.org/abs/2207.13374

Github Link: https://github.com/sollynoay/mmp-rnn

Authors: Yusheng Wang Yunfan Lu Ye Gao Lin Wang Zhihang Zhong

Tag: motion blur recurrent neural network neural network

2207.13378 Identifying Hard Noise in Long-Tailed Sample Distribution

Arxiv Link: http://arxiv.org/abs/2207.13378

Github Link: https://github.com/yxymessi/h2e-framework

Authors: Xuanyu Yi Kaihua Tang Xian-Sheng Hua Joo-Hwee Lim Hanwang Zhang

Tag: training distribution learning framework

2207.13390 Evolutionary Multiparty Distance Minimization

Arxiv Link: http://arxiv.org/abs/2207.13390

Github Link: https://github.com/milabhitsz/2022sheoptmpnds3

Authors: Zeneng She Wenjian Luo Xin Lin Yatong Chang Yuhui Shi

Tag: minimization problem

2207.13417 Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips

Arxiv Link: http://arxiv.org/abs/2207.13417

Github Link: https://github.com/jiawangbai/hpt

Authors: Jiawang Bai Kuofeng Gao Dihong Gong Shu-Tao Xia Zhifeng Li

Tag: deep neural network additive noise optimization algorithm neural network imagenet dataset

2207.13424 Efficient Pix2Vox++ for 3D Cardiac Reconstruction from 2D echo views

Arxiv Link: http://arxiv.org/abs/2207.13424

Github Link: https://github.com/david-stojanovski/e-pix2vox-reconstruction

Authors: David Stojanovski Uxio Hermida Marica Muffoletto Pablo Lamata Arian Beqiri

Tag:

2207.13428 Leveraging GAN Priors for Few-Shot Part Segmentation

Arxiv Link: http://arxiv.org/abs/2207.13428

Github Link: https://github.com/hanmengya1996/pftgan

Authors: Mengya Han Heliang Zheng Chaoyue Wang Yong Luo Han Hu

Tag: downstream task auto-encoder

2207.13475 PASTA-GAN++: A Versatile Framework for High-Resolution Unpaired Virtual Try-on

Arxiv Link: http://arxiv.org/abs/2207.13475

Github Link: https://github.com/xiezhy6/pasta-gan-plusplus

Authors: Zhenyu Xie Zaiyu Huang Fuwei Zhao Haoye Dong Michael Kampffmeyer

Tag:

2207.13479 AutoTransition: Learning to Recommend Video Transition Effects

Arxiv Link: http://arxiv.org/abs/2207.13479

Github Link: https://github.com/acherstyx/autotransition

Authors: Yaojie Shen Libo Zhang Kai Xu Xiaojie Jin

Tag: user study

2207.13513 Learning with Combinatorial Optimization Layers: a Probabilistic Approach

Arxiv Link: http://arxiv.org/abs/2207.13513

Github Link: https://github.com/axelparmentier/inferopt.jl

Authors: Guillaume Dalle Léo Baty Louis Bouvier Axel Parmentier

Tag: stochastic gradient stochastic gradient descent gradient descent optimization algorithm

2207.13543 Abstracting Sketches through Simple Primitives

Arxiv Link: http://arxiv.org/abs/2207.13543

Github Link: https://github.com/explainableml/sketch-primitives

Authors: Stephan Alaniz Massimiliano Mancini Anjan Dutta Diego Marcos Zeynep Akata

Tag: image retrieval

2207.13583 Towards the Neuroevolution of Low-level Artificial General Intelligence

Arxiv Link: http://arxiv.org/abs/2207.13583

Github Link: https://github.com/socratesnfr/neat-nagi-python

Authors: Sidney Pontes-Filho Kristoffer Olsen Anis Yazidi Michael A. Riegler Pål Halvorsen

Tag: curriculum learning network topology artificial neural network neural network

2207.13674 Fast expansion into harmonics on the disk: a steerable basis with fast radial convolutions

Arxiv Link: http://arxiv.org/abs/2207.13674

Github Link: https://github.com/nmarshallf/fle_2d

Authors: Nicholas F. Marshall Oscar Mickelin Amit Singer

Tag:

2207.13676 Open Source Vizier: Distributed Infrastructure and API for Reliable and Flexible Blackbox Optimization

Arxiv Link: http://arxiv.org/abs/2207.13676

Github Link: https://github.com/google/vizier

Authors: Xingyou Song Sagi Perel Chansoo Lee Greg Kochanski Daniel Golovin

Tag: transfer learning distributed system

2207.13686 Shift-tolerant Perceptual Similarity Metric

Arxiv Link: http://arxiv.org/abs/2207.13686

Github Link: https://github.com/abhijay9/shifttolerant-lpips

Authors: Abhijay Ghildyal Feng Liu

Tag: deep neural network reference image neural network

2207.13702 Physical Systems Modeled Without Physical Laws

Arxiv Link: http://arxiv.org/abs/2207.13702

Github Link: https://github.com/samchyams/physicsmodelingml

Authors: David Noever Samuel Hyams

Tag:

2207.13738 Break and Make: Interactive Structural Understanding Using LEGO Bricks

Arxiv Link: http://arxiv.org/abs/2207.13738

Github Link: https://github.com/aaronwalsman/ltron

Authors: Aaron Walsman Muru Zhang Klemen Kotar Karthik Desingh Ali Farhadi

Tag: geometric structure

2207.13751 GAUDI: A Neural Architect for Immersive 3D Scene Generation

Arxiv Link: http://arxiv.org/abs/2207.13751

Github Link: https://github.com/apple/ml-gaudi

Authors: Miguel Angel Bautista Pengsheng Guo Samira Abnar Walter Talbott Alexander Toshev

Tag: camera pose generative model latent representation

2207.13820 Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers

Arxiv Link: http://arxiv.org/abs/2207.13820

Github Link: https://github.com/postech-ami/fastmetro

Authors: Junhyeong Cho Kim Youwang Tae-Hyun Oh

Tag: transformer encoder encoder-decoder architecture

2207.13827 Declarative Smart Contracts

Arxiv Link: http://arxiv.org/abs/2207.13827

Github Link: https://github.com/haoxianchen/declarative-smart-contracts

Authors: Haoxian Chen Gerald Whitters Mohammad Javad Amiri Yuepeng Wang Boon Thau Loo

Tag: programming language

2207.13843 Deep Learning-Based Acoustic Mosquito Detection in Noisy Conditions Using Trainable Kernels and Augmentation

Arxiv Link: http://arxiv.org/abs/2207.13843

Github Link: https://github.com/seancampos/compare2022_vecnet

Authors: Devesh Khandelwal Sean Campos Shwetha Nagaraj Fred Nugen Alberto Todeschini

Tag: deep learning audio signal

2207.13845 EEG2Mel: Reconstructing Sound from Brain Responses to Music

Arxiv Link: http://arxiv.org/abs/2207.13845

Github Link: https://github.com/AGRamirezz/Decoding-Neural-Activity-to-Reconstruct-and-Generate-Music-Stimuli

Authors: Adolfo G. Ramirez-Aristizabal Chris Kello

Tag: information retrieval convolutional neural network neural network success rate deep learning mel-spectrogram

2207.13848 Predicting the Output Structure of Sparse Matrix Multiplication with Sampled Compression Ratio

Arxiv Link: http://arxiv.org/abs/2207.13848

Github Link: https://github.com/lorentzbf/size-prediction

Authors: Zhaoyang Du Yijin Guan Tianchan Guan Dimin Niu Nianxiong Tan

Tag: matrix multiplication

2207.13870 Engineering faster double-array Aho-Corasick automata

Arxiv Link: http://arxiv.org/abs/2207.13870

Github Link: https://github.com/daac-tools/daachorse

Authors: Shunsuke Kanda Koichi Akabe Yusuke Oda

Tag:

2207.13888 Utterance-by-utterance overlap-aware neural diarization with Graph-PIT

Arxiv Link: http://arxiv.org/abs/2207.13888

Github Link: https://github.com/fgnt/graph_pit

Authors: Keisuke Kinoshita Thilo von Neumann Marc Delcroix Christoph Boeddeker Reinhold Haeb-Umbach

Tag: neural network

2207.13921 HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative

Arxiv Link: http://arxiv.org/abs/2207.13921

Github Link: https://github.com/PaddlePaddle/PaddleHelix

Authors: Xiaomin Fang Fan Wang Lihang Liu Jingzhou He Dayong Lin

Tag: supervised learning language model self-supervised learning

2207.13989 Folding Polyiamonds into Octahedra

Arxiv Link: http://arxiv.org/abs/2207.13989

Github Link: https://github.com/dasnessie/folding-polyiamonds

Authors: Eva Stehr Linda Kleist

Tag: polynomial time

2207.14000 Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation

Arxiv Link: http://arxiv.org/abs/2207.14000

Github Link: https://github.com/strong-ai-lab/multi-step-deductive-reasoning-over-natural-language

Authors: Qiming Bao Alex Yuxuan Peng Tim Hartill Neset Tan Zhenyun Deng

Tag: source code attention mechanism deep learning neural network

2207.14192 Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection

Arxiv Link: http://arxiv.org/abs/2207.14192

Github Link: https://github.com/enlighten0707/body-part-map-for-interactiveness

Authors: Xiaoqian Wu Yong-Lu Li Xinpeng Liu Junyi Zhang Yuzhe Wu

Tag: self-attention

2207.14227 Visual Recognition by Request

Arxiv Link: http://arxiv.org/abs/2207.14227

Github Link: https://github.com/chufengt/Visual-Recognition-by-Request

Authors: Chufeng Tang Lingxi Xie Xiaopeng Zhang Xiaolin Hu Qi Tian

Tag: recognition system knowledge base

2207.14252 Improving the Performance of Robust Control through Event-Triggered Learning

Arxiv Link: http://arxiv.org/abs/2207.14252

Github Link: https://github.com/avrohr/betl

Authors: Alexander von Rohr Friedrich Solowjow Sebastian Trimpe

Tag: generating function numerical example model uncertainty learning-based method

2207.14288 Rewriting Geometric Rules of a GAN

Arxiv Link: http://arxiv.org/abs/2207.14288

Github Link: https://github.com/peterwang512/ganwarping

Authors: Sheng-Yu Wang David Bau Jun-Yan Zhu

Tag: large-scale dataset generative model latent space deep generative model


In the last episode, we finished the minimax strategy for Connect-N games, including Tic-Tac-Toe and Gomoku. In this episode, we implement a GUI environment for these games based on the Pygame library, supporting human vs. human, AI vs. AI, and human vs. AI play, which is essential for self-play AlphaGo Zero reinforcement learning. The environment is further wrapped as an OpenAI Gym environment, since Gym is the de facto standard in game reinforcement learning. All code in this series is in the ConnectNGym GitHub repo.

Connect-N Pygame Implementation

Pygame Tic-Tac-Toe Human Play

Python has several well-known multi-platform GUI libraries, such as Tkinter and PyQt. They mainly target desktop GUI applications, their API families are large, and their learning curves are steep. In contrast, Pygame is tailored specifically for small desktop games, so we adopt it.

Pygame 101

Pygame is no exception to the rule of GUI development: it is built on a single-threaded, event-driven model. Below is the simplest desktop Pygame application, which shows a window. The while True loop endlessly retrieves events dispatched by the OS to the window. In this example we only handle the quit event (the user clicking the close button) to exit the whole process. In addition, the clock variable controls the FPS, which we won't elaborate on.

import sys
import pygame

pygame.init()
display = pygame.display.set_mode((800, 600))
clock = pygame.time.Clock()

while True:
    # retrieve events dispatched by the OS to this window
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            sys.exit(0)
    pygame.display.update()
    clock.tick(1)  # cap the loop at 1 FPS

PyGameBoard Class

The PyGameBoard class encapsulates GUI interaction and rendering logic. In the last episode we coded the ConnectNGame class; PyGameBoard is instantiated with a pre-initialized ConnectNGame instance. It handles GUI mouse events to determine the next valid move and then updates its internal state, which is just the ConnectNGame instance passed in. Concretely, the instance method next_user_input(self) loops until the current player identifies a valid action.

PyGameBoard Class Diagram
# Excerpt: rendering attributes (clock, grid geometry) and helpers such as
# _render, check_event, and isGameOver are defined in the full source.
class PyGameBoard:

    def __init__(self, connectNGame: ConnectNGame):
        self.connectNGame = connectNGame
        pygame.init()

    def next_user_input(self) -> Tuple[int, int]:
        # loop until the current player clicks a valid position
        self.action = None
        while not self.action:
            self.check_event()
            self._render()
            self.clock.tick(60)
        return self.action

    def move(self, r: int, c: int) -> int:
        return self.connectNGame.move(r, c)

if __name__ == '__main__':
    connectNGame = ConnectNGame()
    pygameBoard = PyGameBoard(connectNGame)
    while not pygameBoard.isGameOver():
        pos = pygameBoard.next_user_input()
        pygameBoard.move(*pos)

    pygame.quit()

Following Pygame 101, the method check_event() handles events dispatched by the OS; only mouse events from the player are consumed. The method _handle_user_input() converts a mouse event into row and column indices, validates the move, and returns it as a Tuple[int, int]. For instance, (0, 0) is the upper-left corner position.

# Methods of PyGameBoard (continued)
def check_event(self):
    for e in pygame.event.get():
        if e.type == pygame.QUIT:
            pygame.quit()
            sys.exit(0)
        elif e.type == pygame.MOUSEBUTTONDOWN:
            self._handle_user_input(e)

def _handle_user_input(self, e: Event) -> Tuple[int, int]:
    # translate the pixel coordinates of the click into board indices
    origin_x = self.start_x - self.edge_size
    origin_y = self.start_y - self.edge_size
    size = (self.board_size - 1) * self.grid_size + self.edge_size * 2
    pos = e.pos
    if origin_x <= pos[0] <= origin_x + size and origin_y <= pos[1] <= origin_y + size:
        if not self.connectNGame.gameOver:
            x = pos[0] - origin_x
            y = pos[1] - origin_y
            r = int(y // self.grid_size)
            c = int(x // self.grid_size)
            valid = self.connectNGame.checkAction(r, c)
            if valid:
                self.action = (r, c)
                return self.action

Integrated into OpenAI Gym

OpenAI Gym specifies how an agent interacts with an environment. An environment is defined by gym.Env, and the main task in creating a new game environment is to subclass it and override the reset, step, and render methods. Let's see what the skeleton of our ConnectNGym looks like.

class ConnectNGym(gym.Env):

    def reset(self) -> ConnectNGame:
        """Resets the state of the environment and returns an initial observation.

        Returns:
            observation (object): the initial observation.
        """
        raise NotImplementedError

    def step(self, action: Tuple[int, int]) -> Tuple[ConnectNGame, int, bool, None]:
        """Run one timestep of the environment's dynamics. When end of
        episode is reached, you are responsible for calling `reset()`
        to reset this environment's state.

        Accepts an action and returns a tuple (observation, reward, done, info).

        Args:
            action (object): an action provided by the agent

        Returns:
            observation (object): agent's observation of the current environment
            reward (float): amount of reward returned after previous action
            done (bool): whether the episode has ended, in which case further step() calls will return undefined results
            info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
        """
        raise NotImplementedError

    def render(self, mode='human'):
        """
        Renders the environment.

        The set of supported modes varies per environment. (And some
        environments do not support rendering at all.) By convention,
        if mode is:

        - human: render to the current display or terminal and
          return nothing. Usually for human consumption.
        - rgb_array: Return an numpy.ndarray with shape (x, y, 3),
          representing RGB values for an x-by-y pixel image, suitable
          for turning into a video.
        - ansi: Return a string (str) or StringIO.StringIO containing a
          terminal-style text representation. The text can include newlines
          and ANSI escape sequences (e.g. for colors).

        Note:
            Make sure that your class's metadata 'render.modes' key includes
            the list of supported modes. It's recommended to call super()
            in implementations to use the functionality of this method.

        Args:
            mode (str): the mode to render with
        """
        raise NotImplementedError

Method reset()

def reset(self) -> ConnectNGame

reset() restores the environment's internal state and returns the corresponding initial observation. ConnectNGym holds an instance of ConnectNGame as its internal state, and because board games are fully observable, the state observed by the agent is identical to the game's internal state. So reset() returns a deepcopy of the ConnectNGame instance.

Method step()

def step(self, action: Tuple[int, int]) -> Tuple[ConnectNGame, int, bool, None]

Once the agent selects an action and hands it back to the environment, the environment executes the action, changes its internal state via step(), and returns the following four items.

  1. The new state observed by the agent
  2. The reward associated with the action
  3. Whether the environment has terminated
  4. Other auxiliary information

step() is the core API of gym.Env. Below we illustrate a sequence of game state transitions together with their inputs and outputs, followed by a short sketch that drives the environment by hand.

Initial State:
State ((0, 0, 0), (0, 0, 0), (0, 0, 0))

Agent A selects action = (0, 0). ConnectNGym.step() executes the action and returns

status = ((1, 0, 0), (0, 0, 0), (0, 0, 0)), reward = 0, game_end = False

State ((1, 0, 0), (0, 0, 0), (0, 0, 0))

Agent B selects action = (1, 1). ConnectNGym.step() executes the action and returns

status = ((1, 0, 0), (0, -1, 0), (0, 0, 0)), reward = 0, game_end = False

State ((1, 0, 0), (0, -1, 0), (0, 0, 0))

Repeating the process, the game may reach the following terminal state after 5 moves in total.

Terminal State ((1, 1, 1), (-1, -1, 0), (0, 0, 0))

The last step() returns

status = ((1, 1, 1), (-1, -1, 0), (0, 0, 0)), reward = 1, game_end = True
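
To make these transitions concrete, here is a minimal sketch that drives the environment by hand; it assumes the ConnectNGame, PyGameBoard, and ConnectNGym classes implemented in this post, with the GUI disabled.

game = ConnectNGame(N=3, board_size=3)
env = ConnectNGym(PyGameBoard(game), isGUI=False)

state = env.reset()                         # ((0, 0, 0), (0, 0, 0), (0, 0, 0))
state, reward, done, _ = env.step((0, 0))   # agent A: reward = 0, done = False
state, reward, done, _ = env.step((1, 1))   # agent B: reward = 0, done = False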

Method render()

def render(self, mode='human')

Renders the environment. The mode parameter differentiates between a human player and an AI agent.

ConnectNGym Implementation

# REWARD_NONE (the reward of a non-terminal step) is a module-level constant
# in the full source.
class ConnectNGym(gym.Env):

    def __init__(self, pygameBoard: PyGameBoard, isGUI=True, displaySec=2):
        self.pygameBoard = pygameBoard
        self.isGUI = isGUI
        self.displaySec = displaySec
        self.action_space = spaces.Discrete(pygameBoard.board_size * pygameBoard.board_size)
        self.observation_space = spaces.Discrete(pygameBoard.board_size * pygameBoard.board_size)
        self.seed()
        self.reset()

    def reset(self) -> ConnectNGame:
        self.pygameBoard.connectNGame.reset()
        return copy.deepcopy(self.pygameBoard.connectNGame)

    def step(self, action: Tuple[int, int]) -> Tuple[ConnectNGame, int, bool, None]:
        # assert self.action_space.contains(action)

        r, c = action
        reward = REWARD_NONE
        result = self.pygameBoard.move(r, c)
        if self.pygameBoard.isGameOver():
            reward = result

        # result is None while the game is still ongoing
        return copy.deepcopy(self.pygameBoard.connectNGame), reward, result is not None, None

    def render(self, mode='human'):
        if not self.isGUI:
            self.pygameBoard.connectNGame.drawText()
            time.sleep(self.displaySec)
        else:
            self.pygameBoard.display(sec=self.displaySec)

    def get_available_actions(self) -> List[Tuple[int, int]]:
        return self.pygameBoard.getAvailablePositions()

Connect-N Enhanced Minimax Strategy

The following animation shows two minimax AI players playing the Tic-Tac-Toe game (k=3, m=n=3). From the previous episode we know Tic-Tac-Toe is solved to be a draw: when both players play an optimal strategy, the first player is forced into a tie by the second, which matches the animation result.

Minimax AI Self-Play

Game State Rotation Enhancement

In the last episode, we confirmed Tic-Tac-Toe has 5478 total states. The number grows exponentially as k, m, and n increase. For instance, for k=3, m=n=4 the total number of states is 6035992, whereas for k=4, m=n=4 it is 9722011. We can improve the minimax DP strategy by pruning game states that are rotations of an already solved game state: once a game state is solved, we cache not only that state but also the three states derived from it by rotation, which share the same result.

For example, the game state below has the same result as its three rotated counterparts.

Game State 1
Other Rotated 3 States
def similarStatus(self, status: Tuple[Tuple[int, ...]]) -> List[Tuple[Tuple[int, ...]]]:
    ret = []
    rotatedS = status
    for _ in range(4):
        rotatedS = self.rotate(rotatedS)
        ret.append(rotatedS)
    return ret

def rotate(self, status: Tuple[Tuple[int, ...]]) -> Tuple[Tuple[int, ...]]:
    N = len(status)
    board = [[ConnectNGame.AVAILABLE] * N for _ in range(N)]

    for r in range(N):
        for c in range(N):
            board[c][N - 1 - r] = status[r][c]

    return tuple([tuple(board[i]) for i in range(N)])
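
Once a state is solved, its rotations can then be written into the DP cache along with it. A minimal sketch of such a helper (the name cacheWithRotations is ours; dpMap is the status-to-result cache used by PlannedMinimaxStrategy below):

def cacheWithRotations(self, status: Tuple[Tuple[int, ...]], result: int):
    # similarStatus returns the four rotations of status (the last one being
    # status itself), and all of them share the same game result.
    for s in self.similarStatus(status):
        self.dpMap[s] = result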

Minimax Strategy Precomputation

In the last version of the minimax DP strategy, we searched for the best game result given a game state, and leveraged pruning to shortcut once the result could not be improved. However, the AI agent still has to call minimax for every new game state encountered. This is very inefficient, because the same game states are solved over and over during the top-down recursion. An obvious improvement is to compute all game states in a first pass and cache them all; afterwards, for any game state encountered, we only need to aggregate the results of all possible next moves from that state. The code aggregating the possible moves is listed below.

class PlannedMinimaxStrategy(Strategy):
    def __init__(self, game: ConnectNGame):
        super().__init__()
        self.game = copy.deepcopy(game)
        self.dpMap = {}  # game_status => result, move
        self.result = self.minimax(game.getStatus())

    def action(self, game: ConnectNGame) -> Tuple[int, Tuple[int, int]]:
        game = copy.deepcopy(game)

        player = game.currentPlayer
        bestResult = player * -1  # assume opponent win as worst result
        bestMove = None
        for move in game.getAvailablePositions():
            game.move(*move)
            status = game.getStatus()
            game.undo()

            result = self.dpMap[status]

            if player == ConnectNGame.PLAYER_A:
                bestResult = max(bestResult, result)
            else:
                bestResult = min(bestResult, result)
            # update bestMove if any improvement
            bestMove = move if bestResult == result else bestMove
            print(f'move {move} => {result}')

        return bestResult, bestMove

Agent Class and Playing Logic

We also build an Agent class hierarchy, allowing the AI player and the human player to share common code.

BaseAgent is the root class; its default act() method makes a random choice among all available actions.

class BaseAgent(object):
    def __init__(self):
        pass

    def act(self, game: PyGameBoard, available_actions):
        return random.choice(available_actions)

AIAgent delegates act() to its strategy's action() method.

class AIAgent(BaseAgent):
    def __init__(self, strategy: Strategy):
        self.strategy = strategy

    def act(self, game: PyGameBoard, available_actions):
        result, move = self.strategy.action(game.connectNGame)
        assert move in available_actions
        return move

HumanAgent delegates act() to PyGameBoard's next_user_input().

class HumanAgent(BaseAgent):
    def __init__(self):
        pass

    def act(self, game: PyGameBoard, available_actions):
        return game.next_user_input()

Agent Class Diagram

Below are code snippets showing how Agent, ConnectNGym, and PyGameBoard are wired together to play a game.

def play_ai_vs_ai(env: ConnectNGym):
    plannedMinimaxAgent = AIAgent(PlannedMinimaxStrategy(env.pygameBoard.connectNGame))
    play(env, plannedMinimaxAgent, plannedMinimaxAgent)


def play(env: ConnectNGym, agent1: BaseAgent, agent2: BaseAgent):
    agents = [agent1, agent2]

    while True:
        env.reset()
        done = False
        agent_id = -1
        while not done:
            agent_id = (agent_id + 1) % 2
            available_actions = env.get_available_actions()
            agent = agents[agent_id]
            # act on the env's own board rather than a module-level global
            action = agent.act(env.pygameBoard, available_actions)
            _, reward, done, info = env.step(action)
            env.render(True)

            if done:
                print(f'result={reward}')
                time.sleep(3)
                break


if __name__ == '__main__':
    pygameBoard = PyGameBoard(connectNGame=ConnectNGame(board_size=3, N=3))
    env = ConnectNGym(pygameBoard)
    env.render(True)

    play_ai_vs_ai(env)

Class Diagram Overview

This episode extends the last one, where the minimax and alpha-beta pruning algorithms were introduced. We will solve several tic-tac-toe problems on Leetcode, gathering intuition and building blocks for tic-tac-toe game logic, which extends naturally to the Connect-N game and Gomoku (N=5). Then we solve tic-tac-toe using minimax and alpha-beta pruning for small N and analyze their state space. In the following episodes, based on the building blocks here, we will implement a Connect-N OpenAI Gym GUI environment, where we can play against the computer visually or compare different algorithms. Finally, we will demonstrate how to implement Monte Carlo Tree Search for the Connect-N game.

Leetcode Tic-Tac-Toe Problems

1275. Find Winner on a Tic Tac Toe Game (Easy)

Tic-tac-toe is played by two players A and B on a 3 x 3 grid.
Here are the rules of Tic-Tac-Toe:
Players take turns placing characters into empty squares (" ").
The first player A always places "X" characters, while the second player B always places "O" characters.
"X" and "O" characters are always placed into empty squares, never on filled ones.
The game ends when there are 3 of the same (non-empty) character filling any row, column, or diagonal.
The game also ends if all squares are non-empty.
No more moves can be played if the game is over. Given an array moves where each element is another array of size 2 corresponding to the row and column of the grid where they mark their respective character in the order in which A and B play.
Return the winner of the game if it exists (A or B), in case the game ends in a draw return "Draw", if there are still movements to play return "Pending".
You can assume that moves is valid (It follows the rules of Tic-Tac-Toe), the grid is initially empty and A will play first.

Example 1:
Input: moves = [[0,0],[2,0],[1,1],[2,1],[2,2]]
Output: "A"
Explanation: "A" wins, he always plays first.
"X " "X " "X " "X " "X "
" " -> " " -> " X " -> " X " -> " X "
" " "O " "O " "OO " "OOX"

Example 2:
Input: moves = [[0,0],[1,1],[0,1],[0,2],[1,0],[2,0]]
Output: "B"
Explanation: "B" wins.
"X " "X " "XX " "XXO" "XXO" "XXO"
" " -> " O " -> " O " -> " O " -> "XO " -> "XO "
" " " " " " " " " " "O "

Example 3:
Input: moves = [[0,0],[1,1],[2,0],[1,0],[1,2],[2,1],[0,1],[0,2],[2,2]]
Output: "Draw"
Explanation: The game ends in a draw since there are no moves to make.
"XXO"
"OOX"
"XOX"

Example 4:
Input: moves = [[0,0],[1,1]]
Output: "Pending"
Explanation: The game has not finished yet.
"X "
" O "
" "

The intuitive solution is to enumerate all 8 winning conditions: 3 vertical lines, 3 horizontal lines, and 2 diagonal lines. We keep 8 variables, one per winning condition, and a simple trick is to convert the board state into a 3x3 2D array whose cells hold the values -1, 1, or 0. This way we traverse the board exactly once, and in the process compute all 8 variables by summing the corresponding cell values. For example, row[0] tracks the first-row winning condition, accumulated from all 3 cells in the first row during board traversal; it indicates a win for the first player only when it equals 3, and a win for the second player when it equals -3.

# AC
from typing import List

class Solution:
    def tictactoe(self, moves: List[List[int]]) -> str:
        # player A places 1 ("X"), player B places -1 ("O")
        board = [[0] * 3 for _ in range(3)]
        for idx, xy in enumerate(moves):
            player = 1 if idx % 2 == 0 else -1
            board[xy[0]][xy[1]] = player

        row, col = [0, 0, 0], [0, 0, 0]
        diag1, diag2 = 0, 0
        for r in range(3):
            for c in range(3):
                row[r] += board[r][c]
                col[c] += board[r][c]
                if r == c:
                    diag1 += board[r][c]
                if r + c == 2:
                    diag2 += board[r][c]

        aWin = any(row[r] == 3 for r in range(3)) or any(col[c] == 3 for c in range(3)) or diag1 == 3 or diag2 == 3
        bWin = any(row[r] == -3 for r in range(3)) or any(col[c] == -3 for c in range(3)) or diag1 == -3 or diag2 == -3

        return "A" if aWin else "B" if bWin else "Draw" if len(moves) == 9 else "Pending"

Below we give another AC solution. Despite having more code, it is more efficient than the previous one because, for a given game state, it does not need to visit every cell on the board. How? The problem guarantees each move is valid, so it suffices to check the neighbours of the final move and see whether any line through the final move creates a winning condition. Later we will reuse the code in this solution to build the tic-tac-toe game logic.

# AC
from typing import List

class Solution:
    def checkWin(self, r: int, c: int) -> bool:
        north = self.getConnectedNum(r, c, -1, 0)
        south = self.getConnectedNum(r, c, 1, 0)

        east = self.getConnectedNum(r, c, 0, 1)
        west = self.getConnectedNum(r, c, 0, -1)

        south_east = self.getConnectedNum(r, c, 1, 1)
        north_west = self.getConnectedNum(r, c, -1, -1)

        north_east = self.getConnectedNum(r, c, -1, 1)
        south_west = self.getConnectedNum(r, c, 1, -1)

        if (north + south + 1 >= 3) or (east + west + 1 >= 3) or \
                (south_east + north_west + 1 >= 3) or (north_east + south_west + 1 >= 3):
            return True
        return False

    def getConnectedNum(self, r: int, c: int, dr: int, dc: int) -> int:
        player = self.board[r][c]
        result = 0
        i = 1
        while True:
            new_r = r + dr * i
            new_c = c + dc * i
            if 0 <= new_r < 3 and 0 <= new_c < 3:
                if self.board[new_r][new_c] == player:
                    result += 1
                else:
                    break
            else:
                break
            i += 1
        return result

    def tictactoe(self, moves: List[List[int]]) -> str:
        self.board = [[0] * 3 for _ in range(3)]
        for idx, xy in enumerate(moves):
            player = 1 if idx % 2 == 0 else -1
            self.board[xy[0]][xy[1]] = player

        # only check last move
        r, c = moves[-1]
        win = self.checkWin(r, c)
        if win:
            return "A" if len(moves) % 2 == 1 else "B"

        return "Draw" if len(moves) == 9 else "Pending"

Leetcode 794. Valid Tic-Tac-Toe State (Medium)

A Tic-Tac-Toe board is given as a string array board. Return True if and only if it is possible to reach this board position during the course of a valid tic-tac-toe game.
The board is a 3 x 3 array, and consists of characters " ", "X", and "O". The " " character represents an empty square.
Here are the rules of Tic-Tac-Toe:
Players take turns placing characters into empty squares (" ").
The first player A always places "X" characters, while the second player B always places "O" characters.
"X" and "O" characters are always placed into empty squares, never on filled ones.
The game ends when there are 3 of the same (non-empty) character filling any row, column, or diagonal.
The game also ends if all squares are non-empty.
No more moves can be played if the game is over.

Example 1:
Input: board = ["O ", " ", " "]
Output: false
Explanation: The first player always plays "X".

Example 2:
Input: board = ["XOX", " X ", " "]
Output: false
Explanation: Players take turns making moves.

Example 3:
Input: board = ["XXX", " ", "OOO"]
Output: false

Example 4:
Input: board = ["XOX", "O O", "XOX"]
Output: true

Note: board is a length-3 array of strings, where each string board[i] has length 3.
Each board[i][j] is a character in the set {" ", "X", "O"}.

Surely it can be solved with a DFS that checks whether the given state is reachable from the initial state, but that involves searching a huge number of states. Could we do better? There are obvious properties we can rely on; for example, the number of X's is either equal to the number of O's or exceeds it by one. If we can enumerate a set of necessary and sufficient conditions for reachability, we can solve it in O(1) time complexity.

# AC
from typing import List

class Solution:

    def convertCell(self, c: str):
        return 1 if c == 'X' else -1 if c == 'O' else 0

    def validTicTacToe(self, board: List[str]) -> bool:
        turn = 0
        row, col = [0, 0, 0], [0, 0, 0]
        diag1, diag2 = 0, 0
        for r in range(3):
            for c in range(3):
                turn += self.convertCell(board[r][c])
                row[r] += self.convertCell(board[r][c])
                col[c] += self.convertCell(board[r][c])
                if r == c:
                    diag1 += self.convertCell(board[r][c])
                if r + c == 2:
                    diag2 += self.convertCell(board[r][c])

        xWin = any(row[r] == 3 for r in range(3)) or any(col[c] == 3 for c in range(3)) or diag1 == 3 or diag2 == 3
        oWin = any(row[r] == -3 for r in range(3)) or any(col[c] == -3 for c in range(3)) or diag1 == -3 or diag2 == -3
        if (xWin and turn == 0) or (oWin and turn == 1):
            return False
        return (turn == 0 or turn == 1) and (not xWin or not oWin)
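
A quick sanity check against the four examples above:

s = Solution()
assert not s.validTicTacToe(["O  ", "   ", "   "])   # O cannot move first
assert not s.validTicTacToe(["XOX", " X ", "   "])   # move counts are unbalanced
assert not s.validTicTacToe(["XXX", "   ", "OOO"])   # both sides cannot win
assert s.validTicTacToe(["XOX", "O O", "XOX"])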

Leetcode 348. Design Tic-Tac-Toe (Medium, Locked)

Design a Tic-tac-toe game that is played between two players on a n x n grid.
You may assume the following rules:
A move is guaranteed to be valid and is placed on an empty block.
Once a winning condition is reached, no more moves is allowed.
A player who succeeds in placing n of their marks in a horizontal, vertical, or diagonal row wins the game.

Example:
Given n = 3, assume that player 1 is "X" and player 2 is "O" in the board.
TicTacToe toe = new TicTacToe(3);

toe.move(0, 0, 1); -> Returns 0 (no one wins)
|X| | |
| | | | // Player 1 makes a move at (0, 0).
| | | |

toe.move(0, 2, 2); -> Returns 0 (no one wins)
|X| |O|
| | | | // Player 2 makes a move at (0, 2).
| | | |

toe.move(2, 2, 1); -> Returns 0 (no one wins)
|X| |O|
| | | | // Player 1 makes a move at (2, 2).
| | |X|

toe.move(1, 1, 2); -> Returns 0 (no one wins)
|X| |O|
| |O| | // Player 2 makes a move at (1, 1).
| | |X|

toe.move(2, 0, 1); -> Returns 0 (no one wins)
|X| |O|
| |O| | // Player 1 makes a move at (2, 0).
|X| |X|

toe.move(1, 0, 2); -> Returns 0 (no one wins)
|X| |O|
|O|O| | // Player 2 makes a move at (1, 0).
|X| |X|

toe.move(2, 1, 1); -> Returns 1 (player 1 wins)
|X| |O|
|O|O| | // Player 1 makes a move at (2, 1).
|X|X|X|

Follow up:
Could you do better than O(n^2) per move() operation?

348 is a locked problem. For each player's move, we could resort to the checkWin function from the second solution to 1275. Below we show another solution based on the first solution to 1275, where 8 winning-condition counters are kept and each move only touches the few counters associated with it.

# AC
class TicTacToe:

    def __init__(self, n: int):
        """
        Initialize your data structure here.
        :type n: int
        """
        self.row, self.col, self.diag1, self.diag2, self.n = [0] * n, [0] * n, 0, 0, n

    def move(self, row: int, col: int, player: int) -> int:
        """
        Player {player} makes a move at ({row}, {col}).
        @param row The row of the board.
        @param col The column of the board.
        @param player The player, can be either 1 or 2.
        @return The current winning condition, can be either:
                0: No one wins.
                1: Player 1 wins.
                2: Player 2 wins.
        """
        if player == 2:
            player = -1

        self.row[row] += player
        self.col[col] += player
        if row == col:
            self.diag1 += player
        if row + col == self.n - 1:
            self.diag2 += player

        if self.n in [self.row[row], self.col[col], self.diag1, self.diag2]:
            return 1
        if -self.n in [self.row[row], self.col[col], self.diag1, self.diag2]:
            return 2
        return 0
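
A quick check of move() against the example trace above:

toe = TicTacToe(3)
moves = [(0, 0, 1), (0, 2, 2), (2, 2, 1), (1, 1, 2), (2, 0, 1), (1, 0, 2)]
assert all(toe.move(r, c, p) == 0 for r, c, p in moves)  # no one wins yet
assert toe.move(2, 1, 1) == 1  # player 1 completes the bottom row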


Optimal Strategy of Tic-Tac-Toe

Tic-tac-toe and Gomoku (Connect Five in a Row) share the same rules and are generalized as the m,n,k-game: it is played on an m x n board, and the winner is the first to connect k in a row.

The ConnectNGame class implements the m,n,k-game on a square board with an N-in-a-row winning condition. It encapsulates the logic of checking each move and can also undo the last move, to facilitate backtracking in the game search algorithms later.

ConnectNGame

from typing import List, Tuple

class ConnectNGame:

    PLAYER_A = 1
    PLAYER_B = -1
    AVAILABLE = 0
    RESULT_TIE = 0
    RESULT_A_WIN = 1
    RESULT_B_WIN = -1

    def __init__(self, N: int = 3, board_size: int = 3):
        assert N <= board_size
        self.N = N
        self.board_size = board_size
        self.board = [[ConnectNGame.AVAILABLE] * board_size for _ in range(board_size)]
        self.gameOver = False
        self.gameResult = None
        self.currentPlayer = ConnectNGame.PLAYER_A
        self.remainingPosNum = board_size * board_size
        self.actionStack = []

    def move(self, r: int, c: int) -> int:
        """
        Executes the current player's move at (r, c).

        :param r: row of the move
        :param c: column of the move
        :return: the winner (PLAYER_A or PLAYER_B) if the move wins,
                 RESULT_TIE if the board is full, or None if the game is ongoing
        """
        assert self.board[r][c] == ConnectNGame.AVAILABLE
        self.board[r][c] = self.currentPlayer
        self.actionStack.append((r, c))
        self.remainingPosNum -= 1
        if self.checkWin(r, c):
            self.gameOver = True
            self.gameResult = self.currentPlayer
            return self.currentPlayer
        if self.remainingPosNum == 0:
            self.gameOver = True
            self.gameResult = ConnectNGame.RESULT_TIE
            return ConnectNGame.RESULT_TIE
        self.currentPlayer *= -1

    def undo(self):
        if len(self.actionStack) > 0:
            lastAction = self.actionStack.pop()
            r, c = lastAction
            self.board[r][c] = ConnectNGame.AVAILABLE
            self.currentPlayer = ConnectNGame.PLAYER_A if len(self.actionStack) % 2 == 0 else ConnectNGame.PLAYER_B
            self.remainingPosNum += 1
            self.gameOver = False
            self.gameResult = None
        else:
            raise Exception('No lastAction')

    def getAvailablePositions(self) -> List[Tuple[int, int]]:
        return [(i, j) for i in range(self.board_size) for j in range(self.board_size)
                if self.board[i][j] == ConnectNGame.AVAILABLE]

    def getStatus(self) -> Tuple[Tuple[int, ...]]:
        return tuple([tuple(self.board[i]) for i in range(self.board_size)])

Note that the checkWin code is essentially the same as in the second solution to 1275.
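
That method is omitted from the listing above; the following sketch shows how it generalizes, with the hard-coded 3 replaced by self.N and the 3x3 bounds replaced by self.board_size.

    def checkWin(self, r: int, c: int) -> bool:
        # Sketch: the move at (r, c) wins if any of the four lines through it
        # (vertical, horizontal, two diagonals) reaches N connected stones.
        for dr, dc in [(1, 0), (0, 1), (1, 1), (1, -1)]:
            connected = (self.getConnectedNum(r, c, dr, dc)
                         + self.getConnectedNum(r, c, -dr, -dc) + 1)
            if connected >= self.N:
                return True
        return False

    def getConnectedNum(self, r: int, c: int, dr: int, dc: int) -> int:
        # Count same-player stones extending from (r, c) in direction (dr, dc).
        player = self.board[r][c]
        result, i = 0, 1
        while True:
            new_r, new_c = r + dr * i, c + dc * i
            if 0 <= new_r < self.board_size and 0 <= new_c < self.board_size \
                    and self.board[new_r][new_c] == player:
                result += 1
                i += 1
            else:
                break
        return result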

Minimax Strategy

Now that we have the Connect-N game logic, let's complete its minimax strategy to solve the game.

We define a generic strategy base class whose action method must be overridden. action takes a ConnectNGame instance describing the current game state and returns a tuple of 2 elements: the first is the estimated or exact game result after taking the action specified by the second element, which is of the form Tuple[int, int] and denotes the position of the move, for instance (1, 1).

from abc import ABC, abstractmethod

class Strategy(ABC):

    def __init__(self):
        super().__init__()

    @abstractmethod
    def action(self, game: ConnectNGame) -> Tuple[int, Tuple[int, int]]:
        pass
The MinimaxStrategy code is very similar to the earlier minimax algorithms; the only addition is the corresponding move returned by the action method.
class MinimaxStrategy(Strategy):
    def action(self, game: ConnectNGame) -> Tuple[int, Tuple[int, int]]:
        self.game = copy.deepcopy(game)
        result, move = self.minimax()
        return result, move

    def minimax(self) -> Tuple[int, Tuple[int, int]]:
        game = self.game
        bestMove = None
        assert not game.gameOver
        if game.currentPlayer == ConnectNGame.PLAYER_A:
            ret = -math.inf
            for pos in game.getAvailablePositions():
                move = pos
                result = game.move(*pos)
                if result is None:
                    assert not game.gameOver
                    result, oppMove = self.minimax()
                game.undo()
                ret = max(ret, result)
                bestMove = move if ret == result else bestMove
                if ret == 1:
                    return 1, move
            return ret, bestMove
        else:
            ret = math.inf
            for pos in game.getAvailablePositions():
                move = pos
                result = game.move(*pos)
                if result is None:
                    assert not game.gameOver
                    result, oppMove = self.minimax()
                game.undo()
                ret = min(ret, result)
                bestMove = move if ret == result else bestMove
                if ret == -1:
                    return -1, move
            return ret, bestMove
We plot up to the first 2 moves with the code above. For the first player there are 9 possible positions but, due to symmetry, only 3 kinds of moves, which we call corner, edge, and center. The following graph shows that whichever of the 9 positions the first player takes, the best result is a draw; so the solution of tic-tac-toe is a draw.
Tic-tac-toe 9 First Step
We plot the first step of each of the 3 kinds of moves below.
Tic-tac-toe First Step Corner
Tic-tac-toe First Step Edge
Tic-tac-toe First Step Center

Tic-tac-toe Solution and Number of States

An interesting question is the number of game states of tic-tac-toe. A loose upper bound is \(3^9=19683\), which includes many unreachable states. The article Tic-Tac-Toe (Naughts and Crosses, Cheese and Crackers, etc.) lists the number of states after each move; the total number is 5478.

Moves   Positions   Terminal Positions
0       1
1       9
2       72
3       252
4       756
5       1260        120
6       1520        148
7       1140        444
8       390         168
9       78          78
Total   5478        958

We can verify this number with a small change to the existing code, shown below.

class CountingMinimaxStrategy(Strategy):
    def action(self, game: ConnectNGame) -> Tuple[int, Tuple[int, int]]:
        self.game = copy.deepcopy(game)
        self.dpMap = {}
        result, move = self.minimax(game.getStatus())
        return result, move

    def minimax(self, gameStatus: Tuple[Tuple[int, ...]]) -> Tuple[int, Tuple[int, int]]:
        # print(f'Current {len(strategy.dpMap)}')

        if gameStatus in self.dpMap:
            return self.dpMap[gameStatus]

        game = self.game
        bestMove = None
        assert not game.gameOver
        if game.currentPlayer == ConnectNGame.PLAYER_A:
            ret = -math.inf
            for pos in game.getAvailablePositions():
                move = pos
                result = game.move(*pos)
                if result is None:
                    assert not game.gameOver
                    result, oppMove = self.minimax(game.getStatus())
                    self.dpMap[game.getStatus()] = result, oppMove
                else:
                    self.dpMap[game.getStatus()] = result, move
                game.undo()
                ret = max(ret, result)
                bestMove = move if ret == result else bestMove
            self.dpMap[gameStatus] = ret, bestMove
            return ret, bestMove
        else:
            ret = math.inf
            for pos in game.getAvailablePositions():
                move = pos
                result = game.move(*pos)

                if result is None:
                    assert not game.gameOver
                    result, oppMove = self.minimax(game.getStatus())
                    self.dpMap[game.getStatus()] = result, oppMove
                else:
                    self.dpMap[game.getStatus()] = result, move
                game.undo()
                ret = min(ret, result)
                bestMove = move if ret == result else bestMove
            self.dpMap[gameStatus] = ret, bestMove
            return ret, bestMove


if __name__ == '__main__':
    tic_tac_toe = ConnectNGame(N=3, board_size=3)
    strategy = CountingMinimaxStrategy()
    strategy.action(tic_tac_toe)
    print(f'Game States Number {len(strategy.dpMap)}')

Running the code confirms that the total number is 5478. Results for some small game configurations are shown below.

        3x3           4x4
k=3     5478 (Draw)   6035992 (Win)
k=4                   9722011 (Draw)
k=5

According to the Wikipedia article on the m,n,k-game, below are the results for some configurations.

        3x3     4x4     5x5     6x6
k=3     Draw    Win     Win     Win
k=4             Draw    Draw    Win
k=5                     Draw    Draw

It is worth mentioning that Gomoku (Connect Five in a Row), on a board of size 15x15 or larger, was proved by L. Victor Allis to be a first-player win.

Alpha-Beta Pruning Strategy

The alpha-beta pruning strategy is listed below.

class AlphaBetaStrategy(Strategy):
    def action(self, game: ConnectNGame) -> Tuple[int, Tuple[int, int]]:
        self.game = game
        result, move = self.alpha_beta(self.game.getStatus(), -math.inf, math.inf)
        return result, move

    def alpha_beta(self, gameStatus: Tuple[Tuple[int, ...]], alpha: int = None, beta: int = None) -> Tuple[int, Tuple[int, int]]:
        game = self.game
        bestMove = None
        assert not game.gameOver
        if game.currentPlayer == ConnectNGame.PLAYER_A:
            ret = -math.inf
            for pos in game.getAvailablePositions():
                move = pos
                result = game.move(*pos)
                if result is None:
                    assert not game.gameOver
                    result, oppMove = self.alpha_beta(game.getStatus(), alpha, beta)
                game.undo()
                alpha = max(alpha, result)
                ret = max(ret, result)
                bestMove = move if ret == result else bestMove
                if alpha >= beta or ret == 1:
                    return ret, move
            return ret, bestMove
        else:
            ret = math.inf
            for pos in game.getAvailablePositions():
                move = pos
                result = game.move(*pos)
                if result is None:
                    assert not game.gameOver
                    result, oppMove = self.alpha_beta(game.getStatus(), alpha, beta)
                game.undo()
                beta = min(beta, result)
                ret = min(ret, result)
                bestMove = move if ret == result else bestMove
                if alpha >= beta or ret == -1:
                    return ret, move
            return ret, bestMove

We can also rewrite alpha-beta pruning with DP. We omit the alpha and beta parameters in alpha_beta_dp because lru_cache cannot exclude selected parameters from its cache key; instead, we keep alpha and beta on a stack variable that is pushed and popped in sync with the alpha_beta_dp call stack.

class AlphaBetaDPStrategy(Strategy):
    def action(self, game: ConnectNGame) -> Tuple[int, Tuple[int, int]]:
        self.game = game
        self.alphaBetaStack = [(-math.inf, math.inf)]
        result, move = self.alpha_beta_dp(self.game.getStatus())
        return result, move

    @lru_cache(maxsize=None)
    def alpha_beta_dp(self, gameStatus: Tuple[Tuple[int, ...]]) -> Tuple[int, Tuple[int, int]]:
        alpha, beta = self.alphaBetaStack[-1]
        game = self.game
        bestMove = None
        assert not game.gameOver
        if game.currentPlayer == ConnectNGame.PLAYER_A:
            ret = -math.inf
            for pos in game.getAvailablePositions():
                move = pos
                result = game.move(*pos)
                if result is None:
                    assert not game.gameOver
                    self.alphaBetaStack.append((alpha, beta))
                    result, oppMove = self.alpha_beta_dp(game.getStatus())
                    self.alphaBetaStack.pop()
                game.undo()
                alpha = max(alpha, result)
                ret = max(ret, result)
                bestMove = move if ret == result else bestMove
                if alpha >= beta or ret == 1:
                    return ret, move
            return ret, bestMove
        else:
            ret = math.inf
            for pos in game.getAvailablePositions():
                move = pos
                result = game.move(*pos)
                if result is None:
                    assert not game.gameOver
                    self.alphaBetaStack.append((alpha, beta))
                    result, oppMove = self.alpha_beta_dp(game.getStatus())
                    self.alphaBetaStack.pop()
                game.undo()
                beta = min(beta, result)
                ret = min(ret, result)
                bestMove = move if ret == result else bestMove
                if alpha >= beta or ret == -1:
                    return ret, move
            return ret, bestMove

In this series, we deal with algorithms for zero-sum turn-based board games, a subtype of combinatorial games. We start with small-search-space problems, introduce the classic algorithms and the corresponding combinatorial game theory, and ultimately end with modern approximate deep RL techniques; with that stepping stone laid, we will be able to understand and appreciate how AlphaGo works. In this first episode, we illustrate 3 classic Leetcode gaming problems and solve them, going from a brute-force version to a DP version, and finally rewriting them using the classic game algorithms, minimax and alpha-beta pruning.

Leetcode 292 Nim Game (Easy)

Let's start with an easy Leetcode gaming problem, Leetcode 292 Nim Game.

You are playing the following Nim Game with your friend: There is a heap of stones on the table, each time one of you take turns to remove 1 to 3 stones. The one who removes the last stone will be the winner. You will take the first turn to remove the stones.
Both of you are very clever and have optimal strategies for the game. Write a function to determine whether you can win the game given the number of stones in the heap.

Example:
Input: 4
Output: false
Explanation: If there are 4 stones in the heap, then you will never win the game;
No matter 1, 2, or 3 stones you remove, the last stone will always be removed by your friend.

Let \(f(n)\) be the result, either Win or Lose, when it is your turn to move optimally with \(n\) stones left. The first non-trivial case is \(f(4)\). Playing optimally is equivalent to saying that if any choice leads to a Win, you will definitely take it. So you try removing 1, 2, or 3 stones and check whether your opponent can be left in a losing position. Obviously, \(f(1) = f(2) = f(3) = Win\), and therefore \(f(4)\) is guaranteed to be a Lose. Generally, the recurrence relation is given by \[ f(n) = \neg (f(n-1) \land f(n-2) \land f(n-3)) \]

This translates straightforwardly into the following Python 3 code.

# TLE
# Time Complexity: O(exponential)
class Solution_BruteForce:

    def canWinNim(self, n: int) -> bool:
        if n <= 3:
            return True
        for i in range(1, 4):
            if not self.canWinNim(n - i):
                return True
        return False
Since this brute-force version recurses in the same way as the naive Fibonacci computation, its complexity is exponential, so it won't pass the tests. This can be verified visually in the call graph below: notice that node 5 is expanded entirely twice and node 4 is expanded 4 times.

292 Nim Game Brute Force Call Graph, n=7

Just as when computing Fibonacci numbers, we can cache the results for smaller inputs and compute larger values from them. In Python, we achieve the DP cache effect by merely adding one line: the magical decorator lru_cache. This drastically reduces the runtime complexity to \(O(N)\).
# RecursionError: maximum recursion depth exceeded in comparison n=1348820612
# Time Complexity: O(N)
from functools import lru_cache

class Solution_DP:
    @lru_cache(maxsize=None)
    def canWinNim(self, n: int) -> bool:
        if n <= 3:
            return True
        for i in range(1, 4):
            if not self.canWinNim(n - i):
                return True
        return False
Plotting the call graph again verifies this: nodes 5 and 4 are no longer explored to the bottom multiple times. Green nodes denote cache hits.

292 Nim Game DP Call Graph, n=7

However, for this problem lru_cache is not enough to get accepted, because for large n, such as 1348820612, the implementation overflows the stack. We can of course rewrite it as an iterative forward loop, but it still gets TLE.

# TLE for 1348820612
# Time Complexity: O(N)
class Solution:
    def canWinNim(self, n: int) -> bool:
        if n <= 3:
            return True
        last3, last2, last1 = True, True, True
        for i in range(4, n + 1):
            this = not (last3 and last2 and last1)
            last3, last2, last1 = last2, last1, this
        return last1

So AC code requires at most sublinear complexity. The last version also gives us some intuition that Win/Lose may have a period of 4. Actually, if you list all \(f(n)\) one by one, it's obvious that any \(n \bmod 4 = 0\) leads to Lose and all other cases lead to Win. Why? Suppose you start with \(4k+i\ (i=1,2,3)\): you can always remove \(i\) stones and leave \(4k\) stones to your opponent. Whatever he chooses, you are returned a situation \(4k_1 + i_1\ (i_1 = 1,2,3)\). This pattern repeats until you face 1, 2, or 3 remaining stones.
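As a sanity check, here is a small sketch of ours (not part of the submitted solution) that tabulates \(f(n)\) with the recurrence above and confirms the period-4 pattern:

# Verify the period-4 pattern by tabulating f(n) with the recurrence.
def win(n: int) -> bool:
    f = [True] * (n + 1)          # f[1..3] = Win
    for i in range(4, n + 1):
        f[i] = not (f[i - 1] and f[i - 2] and f[i - 3])
    return f[n]

print([(n, 'Win' if win(n) else 'Lose') for n in range(1, 13)])
# Lose appears exactly at n = 4, 8, 12, matching n % 4 == 0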

Win Lose Distribution

Below is the one-liner AC version.

# AC
# Time Complexity: O(1)
class Solution:
    def canWinNim(self, n: int) -> bool:
        return not (n % 4 == 0)

Leetcode 486 Predict the Winner (Medium)

Let's try a harder problem, Leetcode 486 Predict the Winner.

Given an array of scores that are non-negative integers. Player 1 picks one of the numbers from either end of the array followed by the player 2 and then player 1 and so on. Each time a player picks a number, that number will not be available for the next player. This continues until all the scores have been chosen. The player with the maximum score wins.
Given an array of scores, predict whether player 1 is the winner. You can assume each player plays to maximize his score.

Example 1:
Input: [1, 5, 2]
Output: False
Explanation: Initially, player 1 can choose between 1 and 2.
If he chooses 2 (or 1), then player 2 can choose from 1 (or 2) and 5. If player 2 chooses 5, then player 1 will be left with 1 (or 2).
So, final score of player 1 is 1 + 2 = 3, and player 2 is 5.
Hence, player 1 will never be the winner and you need to return False.

Example 2:
Input: [1, 5, 233, 7]
Output: True
Explanation: Player 1 first chooses 1. Then player 2 have to choose between 5 and 7. No matter which number player 2 choose, player 1 can choose 233.
Finally, player 1 has more score (234) than player 2 (12), so you need to return True representing player1 can win.

At each turn, a player can take either the leftmost or the rightmost number and leave the remaining array to his opponent. Let us define maxDiff(l, r) to be the maximum score difference the current player can achieve when facing the subarray \([l, r]\).

\[ \begin{equation*} \operatorname{maxDiff}(l, r) = \max \begin{cases} nums[l] - \operatorname{maxDiff}(l + 1, r)\\\\ nums[r] - \operatorname{maxDiff}(l, r - 1) \end{cases} \end{equation*} \]

The runtime complexity satisfies the following recurrence. \[ f(n) = 2f(n-1) = O(2^n) \]

Surprisingly, this time the brute force version passes, though on the edge of rejection (6300ms).

# AC
# Time Complexity: O(2^N)
# Slow: 6300ms
from typing import List

class Solution:

    def maxDiff(self, l: int, r: int) -> int:
        if l == r:
            return self.nums[l]
        return max(self.nums[l] - self.maxDiff(l + 1, r),
                   self.nums[r] - self.maxDiff(l, r - 1))

    def PredictTheWinner(self, nums: List[int]) -> bool:
        self.nums = nums
        return self.maxDiff(0, len(nums) - 1) >= 0

The exponential runtime complexity can also be verified in the call graph below.
486 Predict the Winner Brute Force Call Graph, n=4

Again, be aware that we have repeated computation over the same node; for example, the [1-2] node is expanded entirely for a second time when going from the root to the right node. Applying the same lru_cache trick, a one-liner decorating maxDiff, we pass again with runtime complexity \(O(n^2)\) and running time 43ms, a trivial change but a substantial improvement!

# AC
# Time Complexity: O(N^2)
# Fast: 43ms
from functools import lru_cache
from typing import List

class Solution:

    @lru_cache(maxsize=None)
    def maxDiff(self, l: int, r: int) -> int:
        if l == r:
            return self.nums[l]
        return max(self.nums[l] - self.maxDiff(l + 1, r),
                   self.nums[r] - self.maxDiff(l, r - 1))

    def PredictTheWinner(self, nums: List[int]) -> bool:
        self.nums = nums
        return self.maxDiff(0, len(nums) - 1) >= 0

Taking a look at the DP version call graph, this time the [1-2] node is not re-computed in the right branch.
486 Predict the Winner DP Call Graph, n=4

Leetcode 464 Can I Win (Medium)

A similar but slightly more difficult problem is Leetcode 464 Can I Win, where bitmask DP is employed.

In the "100 game," two players take turns adding, to a running total, any integer from 1..10. The player who first causes the running total to reach or exceed 100 wins.
What if we change the game so that players cannot re-use integers?
For example, two players might take turns drawing from a common pool of numbers of 1..15 without replacement until they reach a total >= 100.
Given an integer maxChoosableInteger and another integer desiredTotal, determine if the first player to move can force a win, assuming both players play optimally.
You can always assume that maxChoosableInteger will not be larger than 20 and desiredTotal will not be larger than 300.

Example
Input:
maxChoosableInteger = 10
desiredTotal = 11
Output:
false
Explanation:
No matter which integer the first player choose, the first player will lose.
The first player can choose an integer from 1 up to 10.
If the first player choose 1, the second player can only choose integers from 2 up to 10.
The second player will win by choosing 10 and get a total = 11, which is >= desiredTotal.
Same with other integers chosen by the first player, the second player will always win.

# AC
# Time Complexity: O(2^m * m), m: maxChoosableInteger
class Solution:
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def recurse(self, status: int, currentTotal: int) -> bool:
        for i in range(1, self.maxChoosableInteger + 1):
            if not (status >> i & 1):
                new_status = 1 << i | status
                if currentTotal + i >= self.desiredTotal:
                    return True
                if not self.recurse(new_status, currentTotal + i):
                    return True
        return False

    def canIWin(self, maxChoosableInteger: int, desiredTotal: int) -> bool:
        self.maxChoosableInteger = maxChoosableInteger
        self.desiredTotal = desiredTotal

        # integer division; if even the full sum cannot reach the target, nobody wins
        total = maxChoosableInteger * (maxChoosableInteger + 1) // 2
        if total < desiredTotal:
            return False
        return self.recurse(0, 0)

There are \(2^m\) states, and for each state we probe at most \(m\) options, so the overall runtime complexity is \(O(m 2^m)\), where \(m\) is maxChoosableInteger.
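To make the bitmask encoding concrete, here is a tiny illustration of ours of the bit operations used above, where bit i marks that integer i has been taken (bit 0 is unused since integers start at 1):

status = 0
status = 1 << 1 | status   # take 1 -> 0b00010
status = 1 << 3 | status   # take 3 -> 0b01010
print(bin(status))         # 0b1010
print(status >> 3 & 1)     # 1: integer 3 already used
print(status >> 2 & 1)     # 0: integer 2 still available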

Minimax Algorithm

Up till now, we've seen several zero-sum turn-based games on LeetCode. In fact, there is a more general algorithm for this type of game, named minimax with alternate moves. The general setting is that two players play in turn: the first player tries to maximize the game value while the second player tries to minimize it. For example, the following graph shows all nodes labelled with their values. Computing from the bottom up, the first player (max) can obtain optimal value -7, assuming both players play optimally.

Wikipedia Minimax Example

Pseudo code in Python 3 is listed below.

# pseudo code: Node, is_terminal and evaluate_terminal are placeholders
def minimax(node: Node, depth: int, maximizingPlayer: bool) -> int:
    if depth == 0 or is_terminal(node):
        return evaluate_terminal(node)
    if maximizingPlayer:
        value = float('-inf')
        for child in node:
            value = max(value, minimax(child, depth - 1, False))
        return value
    else:  # minimizing player
        value = float('+inf')
        for child in node:
            value = min(value, minimax(child, depth - 1, True))
        return value

Minimax: 486 Predict the Winner

We know Leetcode 486 Predict the Winner is a zero-sum turn-based game. Hence, theoretically, we can come up with a minimax algorithm for it. The difficulty lies in how we define its value or utility. In the previous section, we defined maxDiff(l, r) to be the maximum difference for the current player, who is left with the subarray \([l, r]\). In the base case, where only one element x is left, it's intuitive to define +x for the max player and -x for the min player. Merging this with the minimax algorithm, it naturally follows that the total reward obtained by the max player is \(+a_1 + a_2 + ... = A\), the reward obtained by the min player is \(-b_1 - b_2 - ... = -B\), and the max player aims for \(\max(A-B)\) while the min player aims for \(\min(A-B)\). With that in mind, the code is not hard to implement.

# AC
from functools import lru_cache
from typing import List

class Solution:
    # max_player: max(A - B)
    # min_player: min(A - B)
    @lru_cache(maxsize=None)
    def minimax(self, l: int, r: int, isMaxPlayer: bool) -> int:
        if l == r:
            return self.nums[l] * (1 if isMaxPlayer else -1)

        if isMaxPlayer:
            return max(
                self.nums[l] + self.minimax(l + 1, r, not isMaxPlayer),
                self.nums[r] + self.minimax(l, r - 1, not isMaxPlayer))
        else:
            return min(
                -self.nums[l] + self.minimax(l + 1, r, not isMaxPlayer),
                -self.nums[r] + self.minimax(l, r - 1, not isMaxPlayer))

    def PredictTheWinner(self, nums: List[int]) -> bool:
        self.nums = nums
        v = self.minimax(0, len(nums) - 1, True)
        return v >= 0
Minimax 486 Case [1, 5, 2, 4]

Minimax: 464 Can I Win

For this problem, as is typical for win-lose-tie games without intermediate intrinsic values, we define the value as +1 in case the max player wins, -1 if the min player wins, and 0 for a tie. Note the shortcut cases for both players. For example, the max player can report a Win (value = 1) as soon as he finds the winning condition (>= desiredTotal) satisfied while enumerating his possible moves. This makes sense because once a 1 is obtained during maximizing, no value from further probing can change the final result. The same optimization is generalized in the improved algorithm that comes next, alpha-beta pruning.

# AC
import math
from functools import lru_cache

class Solution:
    @lru_cache(maxsize=None)
    # currentTotal < desiredTotal
    def minimax(self, status: int, currentTotal: int, isMaxPlayer: bool) -> int:
        if status == self.allUsed:
            return 0  # draw: no winner

        if isMaxPlayer:
            value = -math.inf
            for i in range(1, self.maxChoosableInteger + 1):
                if not (status >> i & 1):
                    new_status = 1 << i | status
                    if currentTotal + i >= self.desiredTotal:
                        return 1  # shortcut
                    value = max(value, self.minimax(new_status, currentTotal + i, not isMaxPlayer))
                    if value == 1:
                        return 1
            return value
        else:
            value = math.inf
            for i in range(1, self.maxChoosableInteger + 1):
                if not (status >> i & 1):
                    new_status = 1 << i | status
                    if currentTotal + i >= self.desiredTotal:
                        return -1  # shortcut
                    value = min(value, self.minimax(new_status, currentTotal + i, not isMaxPlayer))
                    if value == -1:
                        return -1
            return value

    # Driver (not in the original snippet): mirrors the earlier version;
    # allUsed has bits 1..m all set.
    def canIWin(self, maxChoosableInteger: int, desiredTotal: int) -> bool:
        self.maxChoosableInteger = maxChoosableInteger
        self.desiredTotal = desiredTotal
        self.allUsed = (1 << (maxChoosableInteger + 1)) - 2
        if maxChoosableInteger * (maxChoosableInteger + 1) // 2 < desiredTotal:
            return False
        return self.minimax(0, 0, True) == 1

Alpha-Beta Pruning

We sensed there was room for optimization during the search, as illustrated in the 464 Can I Win minimax algorithm. Let's formalize this idea, called alpha-beta pruning. For each node, we maintain two values, alpha and beta, which represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of, respectively. The root node starts with alpha = −∞ and beta = +∞, forming the valid interval [−∞, +∞]. During top-down traversal, a child node inherits the alpha and beta values from its parent node; if the updated alpha or beta in the child node no longer forms a valid interval, the branch can be pruned and the call can return immediately. Take the following example from Wikimedia.

  1. Root node, initially: alpha = −∞, beta = +∞

  2. Root node, after 4 is returned: alpha = 4, beta = +∞

  3. Root node, after 5 is returned: alpha = 5, beta = +∞

  4. Rightmost Min node, initially: alpha = 5, beta = +∞

  5. Rightmost Min node, after 1 is returned: alpha = 5, beta = 1

Here we see that [5, 1] is no longer a valid interval, so the node returns without further probing its 2nd and 3rd children. Why? Because if another child returns a value > 1, say 2, it will be replaced by 1, as this is a min node with guaranteed value at most 1. If another child returns a value < 1, it will be abandoned by the root node, a max node, which is already guaranteed a value >= 5. So in this situation, whatever the other children return has no impact on the result.

Wikimedia Alpha Beta Pruning Example

Pseudo code in Python 3 is listed below.

# pseudo code: Node, is_terminal and evaluate_terminal are placeholders
def alpha_beta(node: Node, depth: int, alpha: int, beta: int, maximizingPlayer: bool) -> int:
    if depth == 0 or is_terminal(node):
        return evaluate_terminal(node)
    if maximizingPlayer:
        value = float('-inf')
        for child in node:
            value = max(value, alpha_beta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cut-off
        return value
    else:
        value = float('+inf')
        for child in node:
            value = min(value, alpha_beta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:
                break  # alpha cut-off
        return value

Alpha-Beta Pruning: 486 Predict the Winner

# AC
import math
from typing import List

class Solution:
    def alpha_beta(self, l: int, r: int, curr: int, isMaxPlayer: bool, alpha: int, beta: int) -> int:
        if l == r:
            return curr + self.nums[l] * (1 if isMaxPlayer else -1)

        if isMaxPlayer:
            ret = self.alpha_beta(l + 1, r, curr + self.nums[l], not isMaxPlayer, alpha, beta)
            alpha = max(alpha, ret)
            if alpha >= beta:
                return alpha
            ret = max(ret, self.alpha_beta(l, r - 1, curr + self.nums[r], not isMaxPlayer, alpha, beta))
            return ret
        else:
            ret = self.alpha_beta(l + 1, r, curr - self.nums[l], not isMaxPlayer, alpha, beta)
            beta = min(beta, ret)
            if alpha >= beta:
                return beta
            ret = min(ret, self.alpha_beta(l, r - 1, curr - self.nums[r], not isMaxPlayer, alpha, beta))
            return ret

    def PredictTheWinner(self, nums: List[int]) -> bool:
        self.nums = nums
        v = self.alpha_beta(0, len(nums) - 1, 0, True, -math.inf, math.inf)
        return v >= 0

Alpha-Beta Pruning: 464 Can I Win

# AC
import math
from functools import lru_cache

class Solution:
    @lru_cache(maxsize=None)
    # currentTotal < desiredTotal
    def alpha_beta(self, status: int, currentTotal: int, isMaxPlayer: bool, alpha: int, beta: int) -> int:
        if status == self.allUsed:
            return 0  # draw: no winner

        if isMaxPlayer:
            value = -math.inf
            for i in range(1, self.maxChoosableInteger + 1):
                if not (status >> i & 1):
                    new_status = 1 << i | status
                    if currentTotal + i >= self.desiredTotal:
                        return 1  # shortcut
                    value = max(value, self.alpha_beta(new_status, currentTotal + i, not isMaxPlayer, alpha, beta))
                    alpha = max(alpha, value)
                    if alpha >= beta:
                        return value
            return value
        else:
            value = math.inf
            for i in range(1, self.maxChoosableInteger + 1):
                if not (status >> i & 1):
                    new_status = 1 << i | status
                    if currentTotal + i >= self.desiredTotal:
                        return -1  # shortcut
                    value = min(value, self.alpha_beta(new_status, currentTotal + i, not isMaxPlayer, alpha, beta))
                    beta = min(beta, value)
                    if alpha >= beta:
                        return value
            return value

    # Driver (not in the original snippet): same setup as the minimax version.
    def canIWin(self, maxChoosableInteger: int, desiredTotal: int) -> bool:
        self.maxChoosableInteger = maxChoosableInteger
        self.desiredTotal = desiredTotal
        self.allUsed = (1 << (maxChoosableInteger + 1)) - 2
        if maxChoosableInteger * (maxChoosableInteger + 1) // 2 < desiredTotal:
            return False
        return self.alpha_beta(0, 0, True, -math.inf, math.inf) == 1

C++, Java, Javascript for 486 Predict the Winner

As a bonus, we get AC on Leetcode 486 in C++, Java and Javascript with a bottom-up iterative DP. We illustrate this method in other languages not only because lru_cache is unavailable outside Python, but also to show that there are other ways to solve the problem. Notice the topological ordering of the DP dependency: larger DP values are built from smaller, already-solved ones. In addition, it's worth mentioning that this approach always runs \(n^2\) loop iterations, whereas the top-down caching approach may need fewer.

Java AC Code

// AC
class Solution {
    public boolean PredictTheWinner(int[] nums) {
        int n = nums.length;
        int[][] dp = new int[n][n];
        for (int i = 0; i < n; i++) {
            dp[i][i] = nums[i];
        }

        for (int l = n - 1; l >= 0; l--) {
            for (int r = l + 1; r < n; r++) {
                dp[l][r] = Math.max(
                        nums[l] - dp[l + 1][r],
                        nums[r] - dp[l][r - 1]);
            }
        }
        return dp[0][n - 1] >= 0;
    }
}

C++ AC Code

// AC
class Solution {
public:
    bool PredictTheWinner(vector<int>& nums) {
        int n = nums.size();
        vector<vector<int>> dp(n, vector<int>(n, 0));
        for (int i = 0; i < n; i++) {
            dp[i][i] = nums[i];
        }
        for (int l = n - 1; l >= 0; l--) {
            for (int r = l + 1; r < n; r++) {
                dp[l][r] = max(nums[l] - dp[l + 1][r], nums[r] - dp[l][r - 1]);
            }
        }
        return dp[0][n - 1] >= 0;
    }
};

Javascript AC Code

// AC
/**
 * @param {number[]} nums
 * @return {boolean}
 */
var PredictTheWinner = function(nums) {
    const n = nums.length;
    const dp = new Array(n).fill().map(() => new Array(n));

    for (let i = 0; i < n; i++) {
        dp[i][i] = nums[i];
    }

    // note: the inner loop must start from l + 1 (the original draft used i here)
    for (let l = n - 1; l >= 0; l--) {
        for (let r = l + 1; r < n; r++) {
            dp[l][r] = Math.max(nums[l] - dp[l + 1][r], nums[r] - dp[l][r - 1]);
        }
    }

    return dp[0][n - 1] >= 0;
};
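For reference, the same bottom-up DP translates directly to Python 3. This is a sketch of ours mirroring the Java version above; we have not benchmarked it on LeetCode:

from typing import List

class Solution:
    def PredictTheWinner(self, nums: List[int]) -> bool:
        n = len(nums)
        dp = [[0] * n for _ in range(n)]
        for i in range(n):
            dp[i][i] = nums[i]
        # Iterate from shorter intervals to longer ones so that
        # dp[l + 1][r] and dp[l][r - 1] are ready when dp[l][r] is computed.
        for l in range(n - 1, -1, -1):
            for r in range(l + 1, n):
                dp[l][r] = max(nums[l] - dp[l + 1][r], nums[r] - dp[l][r - 1])
        return dp[0][n - 1] >= 0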

This is the fifth episode of the series TSP From DP to Deep Learning. In this episode, we turn to reinforcement learning, in particular a model-free policy gradient method that embeds a pointer network to learn minimal tours without supervised best-tour labels in the dataset. The full list of episodes in this series is given below.

Pointer Network Refresher

In the previous episode, Pointer Networks in PyTorch, we implemented Pointer Networks in PyTorch on a 2D Euclidean dataset.

Recall that the input is a graph, given as a sequence of \(n\) cities in two-dimensional space

\[ s=\{\mathbf{x_i}\}_{i=1}^n, \mathbf{x}_{i} \in \mathbb{R}^{2} \]

The output is a permutation of the points, \(\pi\), that visits each city exactly once and returns to the starting point with minimal total distance.

Let us define the total distance of a \(\pi\) with respect to \(s\) as \(L\)

\[ L(\pi | s)=\left\|\mathbf{x}_{\pi(n)}-\mathbf{x}_{\pi(1)}\right\|_{2}+\sum_{i=1}^{n-1}\left\|\mathbf{x}_{\pi(i)}-\mathbf{x}_{\pi(i+1)}\right\|_{2} \]
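This tour length is straightforward to compute; below is a small numpy sketch (our helper, not from the paper):

import numpy as np

def tour_length(s: np.ndarray, pi: list) -> float:
    """Total distance L(pi | s): s has shape [n, 2], pi is a permutation of range(n)."""
    d = 0.0
    for i in range(len(pi) - 1):
        d += np.linalg.norm(s[pi[i]] - s[pi[i + 1]])
    d += np.linalg.norm(s[pi[-1]] - s[pi[0]])  # return to the starting city
    return d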

The stochastic policy \(p(\pi | s; \theta)\), parameterized by \(\theta\), aims to assign high probabilities to short tours and low probabilities to long ones. The joint probability factorizes by the chain rule.

\[ p(\pi | s; \theta) = \prod_{i=1}^{n} p\left({\pi(i)} | {\pi(1)}, \ldots, {\pi(i-1)} , s; \theta\right) \]

The model's loss is the cross entropy between the network's output probabilities for \(\pi\) and the best tour \(\hat{\pi}\) generated by a TSP solver.

The contribution of pointer networks is that they remove this constraint: the output indices are dynamic, determined by the particular input, instead of coming from a fixed-size vocabulary.

Reinforcement Learning

Neural Combinatorial Optimization with Reinforcement Learning combines reinforcement learning (RL) with deep learning to eliminate another constraint of pointer networks: the training dataset no longer needs supervised best-tour labels. As is common in deep RL, training cases do not need to come with a solution. The paper adopts a model-free, policy-based RL method.

Model-Free Policy Gradient Methods

The authoritative RL book, in chapter 8, Planning and Learning with Tabular Methods, describes two major approaches in RL: model-based RL and model-free RL. The distinction between the two relies on the concept of a model, which is stated as follows:

By a model of the environment we mean anything that an agent can use to predict how the environment will respond to its actions.

So model-based methods demand a model of the environment; hence dynamic programming and heuristic search fall into this category. With a model in hand, the utility of each state can be computed in various ways, and a planning stage that essentially builds the policy is needed before the agent can take any action. In contrast, model-free methods, without building a model, are more direct, ignoring irrelevant information and focusing on the policy, which is what is ultimately needed. Typical examples of model-free methods are Monte Carlo control and temporal-difference learning.

> Model-based methods rely on planning as their primary component, while model-free methods primarily rely on learning.

In the TSP problem, the model is fully determined by the given points, and no feedback is generated for each decision made, so it's unclear how to map state values onto a tour. Therefore, we turn to model-free methods. Chapter 13, Policy Gradient Methods, describes an approximate model-free method that learns a parameterized policy that can select actions without consulting a value function. This approach fits perfectly with the aforementioned pointer networks, where the parameterized policy \(p(\pi | s; \theta)\) is already defined.

The training objective is the expected tour length of \(\pi_\theta\), given an input graph \(s\):

\[ J(\theta | s) = \mathbb{E}_{\pi \sim p_{\theta}(\cdot | s)} L(\pi | s) \]

Monte Carlo Policy Gradient: REINFORCE with Baseline

In order to optimize the reward, a typical way is to update the parameters \(\theta\) along the direction of the gradient \(\nabla_{\theta} J(\theta | s)\).

\[ \nabla_{\theta} J(\theta | s)=\nabla_{\theta} \mathbb{E}_{\pi \sim p_{\theta}(\cdot | s)} L(\pi | s) \]

The RHS of the equation above is the derivative of an expectation, which we have no direct way to compute or approximate. Here comes the well-known REINFORCE trick, which turns it into the expectation of a derivative; that form can be approximated easily with Monte Carlo sampling, where the expectation is replaced by averaging.

\[ \nabla_{\theta} J(\theta | s)=\mathbb{E}_{\pi \sim p_{\theta}(. | s)}\left[L(\pi | s) \nabla_{\theta} \log p_{\theta}(\pi | s)\right] \]

Another common trick, subtracting a baseline \(b(s)\), leads the derivative of reward to the following equation. Note that \(b(s)\) denotes a baseline function that must not depend on \(\pi\). \[ \nabla_{\theta} J(\theta | s)=\mathbb{E}_{\pi \sim p_{\theta}(. | s)}\left[(L(\pi | s)-b(s)) \nabla_{\theta} \log p_{\theta}(\pi | s)\right] \]

The trick is explained in the RL book as:

Because the baseline could be uniformly zero, this update is a strict generalization of REINFORCE. In general, the baseline leaves the expected value of the update unchanged, but it can have a large effect on its variance.

Finally, the equation can be approximated with Monte Carlo sampling, drawing \(B\) i.i.d. graphs \(s_{1}, s_{2}, \ldots, s_{B} \sim \mathcal{S}\) and sampling a single tour per graph, \(\pi_{i} \sim p_{\theta}(\cdot | s_{i})\), as follows \[ \nabla_{\theta} J(\theta) \approx \frac{1}{B} \sum_{i=1}^{B}\left(L\left(\pi_{i} | s_{i}\right)-b\left(s_{i}\right)\right) \nabla_{\theta} \log p_{\theta}\left(\pi_{i} | s_{i}\right) \]
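In PyTorch, this estimator is commonly implemented by detaching the advantage so that autograd differentiates only the log-probability term. A minimal sketch (the names are ours); minimizing this loss follows the gradient above:

import torch

def reinforce_loss(tour_len: torch.Tensor,   # L(pi_i | s_i), shape [B]
                   baseline: torch.Tensor,   # b(s_i), shape [B]
                   log_probs: torch.Tensor) -> torch.Tensor:  # log p_theta(pi_i | s_i), shape [B]
    # detach() stops gradients flowing through the advantage, so autograd
    # produces exactly (L - b) * grad(log p), averaged over the batch.
    advantage = (tour_len - baseline).detach()
    return (advantage * log_probs).mean()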

Actor Critic Methods

REINFORCE with baseline works quite well, but it also has a disadvantage.

REINFORCE with baseline is unbiased and will converge asymptotically to a local minimum, but like all Monte Carlo methods it tends to learn slowly (produce estimates of high variance) and to be inconvenient to implement online or for continuing problems.

A typical improvement is actor-critic methods, which not only learn an approximate policy (the actor's job) but also an approximate value function (the critic's job). A bootstrapping critic reduces variance and accelerates learning, at the price of introducing bias, which is often beneficial on balance. The detailed algorithm in the paper is illustrated below.

\[ \begin{align*} &\textbf{Algorithm Actor-critic training} \\ &1: \quad \textbf{ procedure } \text{ TRAIN(training set }S \text{, training steps }T \text{, batch size } B \text{)} \\ &2: \quad \quad \text{Initialize pointer network params } \theta \\ &3: \quad \quad \text{Initialize critic network params } \theta_{v} \\ &4: \quad \quad \textbf{for }t=1 \text{ to } T \textbf{ do }\\ &5: \quad \quad \quad s_{i} \sim \operatorname{SAMPLE INPUT } (S) \text{ for } i \in\{1, \ldots, B\} \\ &6: \quad \quad \quad \pi_{i} \sim \operatorname{SAMPLE SOLUTION } \left(p_{\theta}\left(\cdot | s_{i}\right)\right) \text{ for } i \in\{1, \ldots, B\} \\ &7: \quad \quad \quad b_{i} \leftarrow b_{\theta_{v}}\left(s_{i}\right) \text{ for } i \in\{1, \ldots, B\} \\ &8: \quad \quad \quad g_{\theta} \leftarrow \frac{1}{B} \sum_{i=1}^{B}\left(L\left(\pi_{i} | s_{i}\right)-b_{i}\right) \nabla_{\theta} \log p_{\theta}\left(\pi_{i} | s_{i}\right) \\ &9: \quad \quad \quad \mathcal{L}_{v} \leftarrow \frac{1}{B} \sum_{i=1}^{B} \left\| b_{i}-L\left(\pi_{i}\right) \right\| _{2}^{2} \\ &10: \quad \quad \quad \theta \leftarrow \operatorname{ADAM} \left( \theta, g_{\theta} \right) \\ &11: \quad \quad \quad \theta_{v} \leftarrow \operatorname{ADAM}\left(\theta_{v}, \nabla_{\theta_{v}} \mathcal{L}_{v}\right) \\ &12: \quad \quad \textbf{end for} \\ &13: \quad \textbf{return } \theta \\ &14: \textbf{end procedure} \end{align*} \]
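Lines 8-11 map naturally onto two optimizers in PyTorch. Below is a hedged sketch of ours for one training step; `actor`, `critic`, the two Adam optimizers, and the assumption that the actor returns tour lengths and summed log-probabilities are ours, not the paper's exact interface:

import torch.nn.functional as F

def train_step(batch, actor, critic, actor_opt, critic_opt):
    tour_len, log_probs = actor(batch)   # L(pi_i | s_i) and log p_theta(pi_i | s_i), shape [B]
    baseline = critic(batch)             # b_theta_v(s_i), shape [B]

    # Line 8: policy gradient with the critic as baseline
    actor_loss = ((tour_len - baseline).detach() * log_probs).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()                     # Line 10: ADAM update of theta

    # Line 9: critic regression towards observed tour lengths
    critic_loss = F.mse_loss(baseline, tour_len.detach())
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()                    # Line 11: ADAM update of theta_v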

Implementation in PyTorch

Beam Search in OpenNMT-py

In Episode 4, Search for Most Likely Sequence, a 3x3 rectangular trellis was given and several decoding methods were illustrated in plain Python. In the PyTorch ecosystem, the package OpenNMT-py supports efficient batched beam search, but its BeamSearch API is rather involved, so we first demonstrate the earlier trellis problem using that API. For details, please refer to Implementing Beam Search — Part 1: A Source Code Analysis of OpenNMT-py.

from copy import deepcopy
from math import exp
import torch
from onmt.translate import BeamSearch, GNMTGlobalScorer

def run_example():
    BEAM_SIZE = 2
    N_BEST = 1
    BATCH_SZ = 1
    SEQ_LEN = 3

    initial = [0.35, 0.25, 0.4]
    transition_matrix = [
        [0.3, 0.6, 0.1],
        [0.4, 0.2, 0.4],
        [0.3, 0.4, 0.4]]

    beam = BeamSearch(BEAM_SIZE, BATCH_SZ, 0, 1, 2, N_BEST, GNMTGlobalScorer(0.7, 0., "avg", "none"), 0, 30, False, 0, set(), False, 0.)
    device_init = torch.zeros(1, 1)
    beam.initialize(device_init, torch.randint(0, 30, (BATCH_SZ,)))

    def printBestNPaths(beam: BeamSearch, step: int):
        print(f'\nstep {step} beam results:')
        for k in range(BEAM_SIZE):
            best_path = beam.alive_seq[k].squeeze().tolist()[1:]
            prob = exp(beam.topk_log_probs[0][k])
            print(f'prob {prob:.3f} with path {best_path}')

    init_scores = torch.log(torch.tensor([initial], dtype=torch.float))
    init_scores = deepcopy(init_scores.repeat(BATCH_SZ * BEAM_SIZE, 1))
    beam.advance(init_scores, None)
    printBestNPaths(beam, 0)

    for step in range(SEQ_LEN - 1):
        idx_list = beam.topk_ids.squeeze().tolist()
        beam_transition = []
        for idx in idx_list:
            beam_transition.append(transition_matrix[idx])
        beam_transition_tensor = torch.log(torch.tensor(beam_transition))

        beam.advance(beam_transition_tensor, None)
        beam.update_finished()

        printBestNPaths(beam, step + 1)

The output is as follows. With \(k=2\) and 3 steps, the most likely sequence is \(0 \rightarrow 1 \rightarrow 0\), whose probability is 0.084.

step 0 beam results:
prob 0.400 with path [2]
prob 0.350 with path [0]

step 1 beam results:
prob 0.210 with path [0, 1]
prob 0.160 with path [2, 1]

step 2 beam results:
prob 0.084 with path [0, 1, 0]
prob 0.000 with path [0, 1, 2]

RL with PointerNetwork

The complete code is on github TSP RL. Below are partial core classes.

class CombinatorialRL(nn.Module):
    actor: PointerNet

    def __init__(self, rnn_type, use_embedding, embedding_size, hidden_size, seq_len, num_glimpse, tanh_exploration, use_tanh, attention):
        super(CombinatorialRL, self).__init__()

        self.actor = PointerNet(rnn_type, use_embedding, embedding_size, hidden_size, seq_len, num_glimpse, tanh_exploration, use_tanh, attention)

    def forward(self, batch_input: Tensor) -> Tuple[Tensor, List[Tensor], List[Tensor], List[Tensor]]:
        """
        Args:
            batch_input: [batch_size * 2 * seq_len]
        Returns:
            R: Tensor of shape [batch_size]
            action_prob_list: List of [seq_len], tensor shape [batch_size]
            action_list: List of [seq_len], tensor shape [batch_size * 2]
            action_idx_list: List of [seq_len], tensor shape [batch_size]
        """
        batch_size = batch_input.size(0)
        seq_len = batch_input.size(2)
        prob_list, action_idx_list = self.actor(batch_input)

        action_list = []
        batch_input = batch_input.transpose(1, 2)
        for action_id in action_idx_list:
            action_list.append(batch_input[[x for x in range(batch_size)], action_id.data, :])
        action_prob_list = []
        for prob, action_id in zip(prob_list, action_idx_list):
            action_prob_list.append(prob[[x for x in range(batch_size)], action_id.data])

        R = self.reward(action_list)

        return R, action_prob_list, action_list, action_idx_list

    def reward(self, sample_solution: List[Tensor]) -> Tensor:
        """
        Computes total distance of tour
        Args:
            sample_solution: list of size N, each tensor of shape [batch_size * 2]

        Returns:
            tour_len: [batch_size]
        """
        batch_size = sample_solution[0].size(0)
        n = len(sample_solution)
        tour_len = Variable(torch.zeros([batch_size]))

        for i in range(n - 1):
            tour_len += torch.norm(sample_solution[i] - sample_solution[i + 1], dim=1)
        tour_len += torch.norm(sample_solution[n - 1] - sample_solution[0], dim=1)
        return tour_len


This is the fourth episode of the series TSP From DP to Deep Learning. In this episode, we systematically compare different search algorithms for finding the most likely sequence in a simplified Markov chain setting. These methods can be further utilized in the decoding stage of deep learning models, which will be illustrated with reinforcement learning in the next episode. The full list of episodes in this series is given below.

Problem as Markov Chain

In sequence-to-sequence problems, we are constantly faced with the same task: determining the best, or most likely, output sequence. This kind of problem appears extensively in algorithms and machine learning, where we are given the initial states and the dynamics of the system, and the goal is to find the most likely path. The corresponding mathematical concept is the Markov chain.

Let us describe the problem in the context of a Markov chain. Suppose there are \(n\) states, and the initial state distribution is given by \(s_0 = [0.35, 0.25, 0.4]\). The transition matrix is defined by \(T\), where \(T[i][j]\) denotes the probability of transitioning from \(i\) to \(j\). Notice that each row sums to \(1.0\). \[ T= \begin{matrix} & \begin{matrix}0&1&2\end{matrix} \\\\ \begin{matrix}0\\\\1\\\\2\end{matrix} & \begin{bmatrix}0.3&0.6&0.1\\\\0.4&0.2&0.4\\\\0.3&0.3&0.4\end{bmatrix}\\\\ \end{matrix} \]

The probability distribution of the next state \(s_1\) is derived by multiplying \(s_0\) and \(T\), which can be interpreted visually in the animation below.

The actual probability distribution of \(s_1\) is computed numerically below. Recall that left-multiplying a matrix by a row vector amounts to taking a linear combination of the matrix's rows.

\[ s_1 = \begin{bmatrix}0.35& 0.25& 0.4\end{bmatrix} \begin{matrix} \begin{bmatrix}0.3&0.6&0.1\\\\0.4&0.2&0.4\\\\0.3&0.3&0.4\end{bmatrix}\\\\ \end{matrix} = \begin{bmatrix}0.325& 0.38& 0.295\end{bmatrix} \] Again, state \(s_2\) can be derived in the same way, as \(s_1 \times T\), where we assume the transition dynamics remain the same. In deep learning problems, however, the dynamics usually depend on \(s_i\), or vary from stage to stage.

Suppose there are only 3 stages in our problem, i.e., \(s_0 \rightarrow s_1 \rightarrow s_2\). Let \(L\) be the number of stages and \(N\) the number of vertices in each stage; hence \(L=N=3\) in our setting. There are \(N^L\) different paths from the initial stage to the final stage.

Let us compute an arbitrary path probability as an example, \(2(s_0) \rightarrow 1(s_1) \rightarrow 2(s_2)\). The total probability is

\[ p(2 \rightarrow 1 \rightarrow 2) = s_0[2] \times T[2][1] \times T[1][2] = 0.4 \times 0.3 \times 0.4 = 0.048 \]
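A few lines of Python (ours) reproduce this arithmetic:

initial = [0.35, 0.25, 0.4]
transition = [[0.3, 0.6, 0.1],
              [0.4, 0.2, 0.4],
              [0.3, 0.3, 0.4]]

path = [2, 1, 2]
prob = initial[path[0]]
for prev_v, v in zip(path, path[1:]):
    prob *= transition[prev_v][v]
print(prob)  # 0.048 (up to float rounding)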

First, we implement the exhaustive \(N^L\) brute force search.

The following Python 3 function returns one most likely sequence and its probability. Running it on our example produces probability 0.084 with route \(0 \rightarrow 1 \rightarrow 0\) (the route \(0 \rightarrow 1 \rightarrow 2\) attains the same maximum probability).

def search_brute_force(initial: List, transition: List, L: int) -> Tuple[float, Tuple]:
    # product enumerates all N^L paths; combinations_with_replacement would
    # miss any path whose vertices are not in non-decreasing order
    from itertools import product
    v = [0, 1, 2]
    path_all = product(v, repeat=L)

    max_prob = 0.0
    max_route = None
    prob = 0.0
    for path in path_all:
        for idx, v in enumerate(path):
            if idx == 0:
                prob = initial[v]  # reset to initial state
            else:
                prev_v = path[idx - 1]
                prob *= transition[prev_v][v]
        if prob > max_prob:
            max_prob = prob
            max_route = path
    return max_prob, max_route

Exhaustive search always finds the most likely sequence, like searching for a needle in a haystack, at the cost of exponential runtime complexity \(O(N^L)\). The simplest strategy, known as greedy, selects one vertex in each stage and then expands the chosen vertex into the next stage. This strategy, of course, is not guaranteed to find the most likely sequence, but it is fast. See the animation below.

Code in Python 3 is given below. The numpy package is employed for np.argmax(), for code clarity. Notice there are two nested loops (the inner one is np.argmax), so the runtime complexity is \(O(N \times L)\).

def search_greedy(initial: List, transition: List, L: int) -> Tuple[float, Tuple]:
    import numpy as np
    max_route = []
    max_prob = 0.0
    states = np.array(initial)

    prev_max_v = None
    for l in range(0, L):
        max_v = np.argmax(states)
        max_route.append(max_v)
        if l == 0:
            max_prob = initial[max_v]
        else:
            max_prob = max_prob * transition[prev_max_v][max_v]
        # next-stage scores: the path probability so far times the
        # transition probabilities out of the chosen vertex
        states = max_prob * np.array(transition[max_v])
        prev_max_v = max_v

    return max_prob, max_route

We can improve on the greedy strategy a bit by expanding more vertices in each stage. In beam search with beam width \(k\), at each stage we keep the \(k\) nodes with the highest probability and expand them into the next stage. In our example with \(k=2\), we select the top 2 nodes in stage \(s_0\), expand them and evaluate \(2 \times 3\) nodes in stage \(s_1\), then again select the top 2 and evaluate 6 nodes in stage \(s_2\). Beam search, like the greedy strategy, is not guaranteed to find the most likely sequence, but it extends the search space with only linear extra complexity.

Below is an implementation in Python 3 using PriorityQueue to select the top \(k\) nodes. Notice that in order to reverse the ordering of PriorityQueue, a class decorated with @total_ordering and an inverted __lt__ is defined. The runtime complexity is \(O(k \times N \times L)\).

def search_beam(initial: List, transition: List, L: int, K: int) -> Tuple[float, Tuple]:
    N = len(initial)
    from queue import PriorityQueue
    current_q = PriorityQueue()
    next_q = PriorityQueue()

    from functools import total_ordering

    @total_ordering
    class PQItem(object):
        def __init__(self, prob, route):
            self.prob = prob
            self.route = route
            self.last_v = int(route[-1])

        def __eq__(self, other):
            return self.prob == other.prob

        def __lt__(self, other):
            return self.prob > other.prob

    for v in range(N):
        next_q.put(PQItem(initial[v], str(v)))

    for l in range(1, L):
        current_q = next_q
        next_q = PriorityQueue()
        k = K
        while not current_q.empty() and k > 0:
            item = current_q.get()
            prob, route, prev_v = item.prob, item.route, item.last_v
            k -= 1
            for v in range(N):
                nextItem = PQItem(prob * transition[prev_v][v], route + str(v))
                next_q.put(nextItem)

    max_item = next_q.get()

    return max_item.prob, list(map(lambda x: int(x), max_item.route))

Viterbi DP

Similar to the TSP DP version, there is a dynamic programming approach, widely known as the Viterbi algorithm, that always finds the sequence with maximum probability while reducing the runtime complexity from \(O(N^L)\) to \(O(L \times N \times N)\) (corresponding to the 3 loops in the code below). The core idea is that in each stage, an array keeps the maximum probability of paths ending at each vertex, and this dp array is the input to the next stage. For example, let \(dp[1][0]\) be the maximum path probability at stage \(s_1\) ending at vertex 0. \[ dp[1][0] = \max \\{s_0[0] \rightarrow s_1[0], s_0[1] \rightarrow s_1[0], s_0[2] \rightarrow s_1[0]\\} \]

The illustrative code below returns the maximum probability but not the route, to emphasize the three-loop pattern and the max operation, which are the essence of the algorithm.

def search_dp(initial: List, transition: List, L: int) -> float:
    N = len(initial)
    dp = [[0.0 for c in range(N)] for r in range(L)]
    dp[0] = initial[:]

    for l in range(1, L):
        for v in range(N):
            for prev_v in range(N):
                dp[l][v] = max(dp[l][v], dp[l - 1][prev_v] * transition[prev_v][v])

    return max(dp[L - 1])

Probabilistic Sampling

All algorithms described above are deterministic. However, in NLP deep learning decoding, determinism has a disadvantage: the decoder may get trapped in repeated phrases or sentences. For example, a paragraph like the one below is commonly generated:

This is the best of best of best of ...

One way to escape the loop is to resort to probabilistic sampling. For example, we can generate one vertex at each stage probabilistically, according to the per-stage weights or the total path probability.

For demonstration purposes, here is code based on the greedy strategy that probabilistically selects one vertex at each stage.

def search_prob_greedy(initial: List, transition: List, L: int) -> Tuple[float, Tuple]:
    import random
    N = len(initial)
    max_route = []
    max_prob = 0.0
    vertices = [i for i in range(N)]
    prob = initial[:]

    for l in range(0, L):
        v_lst = random.choices(vertices, prob)
        v = v_lst[0]
        max_route.append(v)
        max_prob = prob[v]
        prob = [prob[v] * transition[v][v_target] for v_target in range(N)]

    return max_prob, max_route
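To wrap up, here is a quick comparison run of the strategies above on the running example (a sketch of ours; it assumes the functions defined earlier are in scope):

initial = [0.35, 0.25, 0.4]
transition = [[0.3, 0.6, 0.1],
              [0.4, 0.2, 0.4],
              [0.3, 0.3, 0.4]]
L = 3

print(search_brute_force(initial, transition, L))  # prob ~ 0.084, route (0, 1, 0)
print(search_greedy(initial, transition, L))       # prob ~ 0.064, route [2, 2, 2]
print(search_beam(initial, transition, L, K=2))    # prob ~ 0.084
print(search_dp(initial, transition, L))           # ~ 0.084 (probability only)
print(search_prob_greedy(initial, transition, L))  # random; varies per run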

This is the third episode of the series TSP From DP to Deep Learning. In this episode, we enter the realm of deep learning: a type of sequence-to-sequence model called Pointer Networks is introduced. It is tailored to problems like TSP and Convex Hull. The full list of episodes in this series is given below.

Pointer Networks

In a traditional sequence-to-sequence RNN, the output classes come from a vocabulary of pre-defined size. For instance, a word-generating RNN utters one word from a vocabulary of size \(|V|\) at each time step. However, there is a large set of problems, such as Convex Hull, Delaunay Triangulation and TSP, where the range of each output is not pre-defined but of variable size, determined by the input. Pointer Networks overcome this constraint by selecting the \(i\)-th input with a probability derived from an attention score.

Convex Hull

In the following example, 10 points are given, and the output is a sequence of points that bounds the set of all points. Each value in the output sequence is an integer, ranging from 1 to 10 in this case, as determined by the concrete input. In general, finding the exact solution has been proven equivalent to sorting, with time complexity \(O(n \log n)\).


\[ \begin{align*} &\text{Input: } \mathcal{P} &=& \left\{P_{1}, \ldots, P_{10} \right\} \\ &\text{Output: } C^{\mathcal{P}} &=& \{2,4,3,5,6,7,2\} \end{align*} \]

TSP

TSP is almost identical to the Convex Hull problem, though the output sequence has fixed length. In the previous episode, we reduced the complexity from \(O(n!)\) to \(O(n^2 2^n)\).


\[ \begin{align*} &\text{Input: } \mathcal{P} &= &\left\{P_{1}, \ldots, P_{6} \right\} \\ &\text{Output: } C^{\mathcal{P}} &=& \{1,3,2,4,5,6,1\} \end{align*} \]

Delaunay Triangulation

A Delaunay triangulation for a set of points in a plane is a triangulation such that the circumcircle of every triangle is empty, meaning it contains no point from \(\mathcal{P}\) in its interior. This kind of problem outputs a sequence of sets, and each element of each set ranges over the input set \(\mathcal{P}\).

\[ \begin{align*} &\text{Input: } \mathcal{P} &=& \left\{P_{1}, \ldots, P_{5} \right\} \\ &\text{Output: } C^{\mathcal{P}} &=& \{(1,2,4),(1,4,5),(1,3,5),(1,2,3)\} \end{align*} \]

Sequence-to-Sequence Model

Suppose now that \(n\) is fixed. Given a training pair \((\mathcal{P}, C^{\mathcal{P}})\), the vanilla sequence-to-sequence model parameterized by \(\theta\) computes the conditional probability.

\[ \begin{equation} p\left(\mathcal{C}^{\mathcal{P}} | \mathcal{P} ; \theta\right)=\prod_{i=1}^{m(\mathcal{P})} p\left(C_{i} | C_{1}, \ldots, C_{i-1}, \mathcal{P} ; \theta\right) \end{equation} \] The parameters of the model are learnt by maximizing the conditional probabilities for the training set, i.e. \[ \begin{equation} \theta^{*}=\underset{\theta}{\arg \max } \sum_{\mathcal{P}, \mathcal{C}^{\mathcal{P}}} \log p\left(\mathcal{C}^{\mathcal{P}} | \mathcal{P} ; \theta\right) \end{equation} \]

Content Based Input Attention

When attention is applied to the vanilla sequence-to-sequence model, better results are obtained.

Let the encoder and decoder states be \((e_{1}, \ldots, e_{n})\) and \((d_{1}, \ldots, d_{m(\mathcal{P})})\), respectively. At each output time \(i\), we compute the attention vector \(d_i\) as a linear combination of \((e_{1}, \ldots, e_{n})\) with weights \((a_{1}^{i}, \ldots, a_{n}^{i})\) \[ d_{i} = \sum_{j=1}^{n} a_{j}^{i} e_{j} \]

\((a_{1}^{i}, \ldots, a_{n}^{i})\) is the softmax of \((u_{1}^{i}, \ldots, u_{n}^{i})\), where \(u_{j}^{i}\) can be considered as the distance between \(d_{i}\) and \(e_{j}\). Notice that \(v\), \(W_1\), and \(W_2\) are learnable parameters of the model.

\[ \begin{eqnarray} u_{j}^{i} &=& v^{T} \tanh \left(W_{1} e_{j}+W_{2} d_{i}\right) \quad j \in(1, \ldots, n) \\ a_{j}^{i} &=& \operatorname{softmax}\left(u_{j}^{i}\right) \quad j \in(1, \ldots, n) \end{eqnarray} \]

Pointer Networks


Pointer Networks do not blend the encoder states \(e_j\) to propagate extra information to the decoder; instead, they use \(u^i_j\) directly as pointers to the input elements.

\[ \begin{eqnarray*} u_{j}^{i} &=& v^{T} \tanh \left(W_{1} e_{j}+W_{2} d_{i}\right) \quad j \in(1, \ldots, n) \\ p\left(C_{i} | C_{1}, \ldots, C_{i-1}, \mathcal{P}\right) &=& \operatorname{softmax}\left(u^{i}\right) \end{eqnarray*} \]

More on Attention

The FloydHub Blog - Attention Mechanism gives a clear and detailed explanation of the differences and similarities between the first classic type of attention, commonly referred to as Additive Attention and proposed by Dzmitry Bahdanau, and the second classic type, known as Multiplicative Attention and proposed by Thang Luong.

It's well known that Luong Attention defines three alignment scoring functions, i.e., measures of the distance between \(d_{i}\) and \(e_{j}\).

\[ \operatorname{score} \left( d_i, e_j \right)= \begin{cases} d_i^{\top} e_j & \text { dot } \\ d_i^{\top} W_a e_j & \text { general } \\ v_a^{\top} \tanh \left( W_a \left[ d_i ; e_j \right] \right) & \text { concat } \end{cases} \]

PyTorch Implementation

In episode 2, we introduced the TSP dataset, where each case is a line of the following form.

x0, y0, x1, y1, ... output 1 v1 v2 v3 ... 1

PyTorch Dataset

Each case is converted to a tuple (input, input_len, output_in, output_out, output_len) of np.ndarray values, with appropriate padding, and encapsulated in an extended PyTorch Dataset.

from typing import List, Tuple
import numpy as np
from torch.utils.data import Dataset

class TSPDataset(Dataset):
    "each data item of form (input, input_len, output_in, output_out, output_len)"
    data: List[Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray, np.ndarray]]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        input, input_len, output_in, output_out, output_len = self.data[index]
        return input, input_len, output_in, output_out, output_len

PyTorch pad_packed_sequence

PyTorch seq-to-seq model code typically utilizes the pack_padded_sequence and pad_packed_sequence APIs to reduce computational cost. A detailed explanation is given at https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning#decoder-1.

from typing import Tuple, Union
import torch.nn as nn
from torch import Tensor
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class RNNEncoder(nn.Module):
    rnn: Union[nn.LSTM, nn.GRU, nn.RNN]

    def __init__(self, rnn_type: str, bidirectional: bool, num_layers: int, input_size: int, hidden_size: int, dropout: float):
        super(RNNEncoder, self).__init__()
        if bidirectional:
            assert hidden_size % 2 == 0
            hidden_size = hidden_size // 2
        # rnn_init is a project helper that instantiates nn.LSTM/nn.GRU/nn.RNN
        self.rnn = rnn_init(rnn_type, input_size=input_size, hidden_size=hidden_size, bidirectional=bidirectional, num_layers=num_layers, dropout=dropout)

    def forward(self, src: Tensor, src_lengths: Tensor, hidden: Tensor = None) -> Tuple[Tensor, Tensor]:
        lengths = src_lengths.view(-1).tolist()
        packed_src = pack_padded_sequence(src, lengths)
        memory_bank, hidden_final = self.rnn(packed_src, hidden)
        memory_bank = pad_packed_sequence(memory_bank)[0]
        return memory_bank, hidden_final

Attention Code

from typing import Tuple
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import Tensor

class Attention(nn.Module):
    linear_out: nn.Linear

    def __init__(self, dim: int):
        super(Attention, self).__init__()
        self.linear_out = nn.Linear(dim * 2, dim, bias=False)

    def score(self, src: Tensor, target: Tensor) -> Tensor:
        # dot-style alignment score via batched matrix multiplication
        batch_size, src_len, dim = src.size()
        _, target_len, _ = target.size()
        target_ = target
        src_ = src.transpose(1, 2)
        return torch.bmm(target_, src_)

    def forward(self, src: Tensor, target: Tensor, src_lengths: Tensor) -> Tuple[Tensor, Tensor]:
        assert target.dim() == 3

        batch_size, src_len, dim = src.size()
        _, target_len, _ = target.size()

        align_score = self.score(src, target)

        # sequence_mask is a project helper that masks padded positions
        mask = sequence_mask(src_lengths)
        # (batch_size, max_len) -> (batch_size, 1, max_len)
        mask = mask.unsqueeze(1)
        align_score.data.masked_fill_(~mask, -float('inf'))
        align_score = F.softmax(align_score, -1)

        c = torch.bmm(align_score, src)

        concat_c = torch.cat([c, target], -1)
        attn_h = self.linear_out(concat_c)

        return attn_h, align_score

Complete PyTorch implementation source code is also available on github.


This is the second episode of the series TSP From DP to Deep Learning.

AIZU TSP Bottom Up Iterative DP

In the last episode, we provided a top-down recursive DP in Python 3 and Java 8. Now we continue to improve it and convert it to a bottom-up iterative DP version. Below is a graph with 3 vertices, with the top-down recursive calls drawn out completely.

Looking from the bottom up, we can identify the corresponding topological computing order with ease. First, we compute all bit states with 3 ones, then those with 2 ones, then those with 1 one.

Pseudo Java code below.

for (int bitset_num = N; bitset_num >= 0; bitset_num--) {
    while (hasNextCombination(bitset_num)) {
        int state = nextCombination(bitset_num);
        // compute dp[state][v], v-th bit is set in state
        for (int v = 0; v < n; v++) {
            for (int u = 0; u < n; u++) {
                // for each u not reached by this state
                if (!include(state, u)) {
                    dp[state][v] = min(dp[state][v],
                            dp[new_state_include_u][u] + dist[v][u]);
                }
            }
        }
    }
}

For example, dp[00010][1] is the min distance starting from vertex 0, and just arriving at vertex 1: \(0 \rightarrow 1 \rightarrow ? \rightarrow ? \rightarrow ? \rightarrow 0\). In order to find out total min distance, we need to enumerate all possible u for first question mark. \[ (0 \rightarrow 1) + \begin{align*} \min \left\lbrace \begin{array}{r@{}l} 2 \rightarrow ? \rightarrow ? \rightarrow 0 + dist(1,2) \qquad\text{ new_state=[00110][2] } \qquad\\\\ 3 \rightarrow ? \rightarrow ? \rightarrow 0 + dist(1,3) \qquad\text{ new_state=[01010][3] } \qquad\\\\ 4 \rightarrow ? \rightarrow ? \rightarrow 0 + dist(1,4) \qquad\text{ new_state=[10010][4] } \qquad \end{array} \right. \end{align*} \]

Java Iterative DP Code

AC code is available in Python 3 and Java 8. The core Java code is illustrated below.

public long solve() {
    int N = g.V_NUM;
    long[][] dp = new long[1 << N][N];
    // init dp[][] with MAX
    for (int i = 0; i < dp.length; i++) {
        Arrays.fill(dp[i], Integer.MAX_VALUE);
    }
    dp[(1 << N) - 1][0] = 0;

    for (int state = (1 << N) - 2; state >= 0; state--) {
        for (int v = 0; v < N; v++) {
            for (int u = 0; u < N; u++) {
                if (((state >> u) & 1) == 0) {
                    dp[state][v] = Math.min(dp[state][v], dp[state | 1 << u][u] + g.edges[v][u]);
                }
            }
        }
    }
    return dp[0][0] == Integer.MAX_VALUE ? -1 : dp[0][0];
}

In this way, the runtime complexity can be spotted easily: three nested for loops, leading to O(\(2^n \times n \times n\)) = O(\(2^n n^2\)).

DP on Euclidean Dataset

So far, the TSP DP solution has been crystal clear, and we move forward to the PTR_NET dataset on Google Drive, published by Oriol Vinyals, the author of Pointer Networks. Each line in the dataset has the following pattern:

x0, y0, x1, y1, ... output 1 v1 v2 v3 ... 1

It first lists the n points in (x, y) coordinates, followed by the word "output", followed by one of the minimal-distance tours, starting and ending with vertex 1 (indexed from 1, not 0).

Some examples of 10 vertices are:

0.607122 0.664447 0.953593 0.021519 0.757626 0.921024 0.586376 0.433565 0.786837 0.052959 0.016088 0.581436 0.496714 0.633571 0.227777 0.971433 0.665490 0.074331 0.383556 0.104392 output 1 3 8 6 10 9 5 2 4 7 1 
0.930534 0.747036 0.277412 0.938252 0.794592 0.794285 0.961946 0.261223 0.070796 0.384302 0.097035 0.796306 0.452332 0.412415 0.341413 0.566108 0.247172 0.890329 0.429978 0.232970 output 1 3 2 9 6 5 8 7 10 4 1
0.686712 0.087942 0.443054 0.277818 0.494769 0.985289 0.559706 0.861138 0.532884 0.351913 0.712561 0.199273 0.554681 0.657214 0.909986 0.277141 0.931064 0.639287 0.398927 0.406909 output 1 5 2 10 7 4 3 9 8 6 1
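Each line can be split on the literal token "output"; a small parsing sketch of ours:

def parse_case(line: str):
    """Split one dataset line into an [n, 2] coordinate list and a 1-indexed tour."""
    coord_str, tour_str = line.split(' output ')
    values = list(map(float, coord_str.split()))
    coordinates = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    tour = list(map(int, tour_str.split()))
    return coordinates, tour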

Plot the first example using the code below.

import matplotlib.pyplot as plt

points = '0.607122 0.664447 0.953593 0.021519 0.757626 0.921024 0.586376 0.433565 0.786837 0.052959 0.016088 0.581436 0.496714 0.633571 0.227777 0.971433 0.665490 0.074331 0.383556 0.104392'
float_list = list(map(lambda x: float(x), points.split(' ')))

x, y = [], []
for idx, p in enumerate(float_list):
    if idx % 2 == 0:
        x.append(p)
    else:
        y.append(p)

for i in range(0, len(x)):
    for j in range(0, len(x)):
        if i == j:
            continue
        plt.plot((x[i], x[j]), (y[i], y[j]))

plt.show()

TSP Case Fully Connected

Now plot the optimal tour: \[ 1 \rightarrow 3 \rightarrow 8 \rightarrow 6 \rightarrow 10 \rightarrow 9 \rightarrow 5 \rightarrow 2 \rightarrow 4 \rightarrow 7 \rightarrow 1 \]

tour_str = '1 3 8 6 10 9 5 2 4 7 1'
tour = list(map(lambda x: int(x), tour_str.split(' ')))

for i in range(0, len(tour) - 1):
    p1 = tour[i] - 1
    p2 = tour[i + 1] - 1
    plt.plot((x[p1], x[p2]), (y[p1], y[p2]))

plt.show()
TSP Case Minimal Tour

Python Code Illustrated

Init Graph Edges

Based on the previous top-down version, several changes are made. First, we need an edge between every pair of vertices, and because we represent directed edges with a matrix, both directions are initialized.

g: Graph = Graph(N)
for v in range(N):
    for u in range(N):
        diff_x = coordinates[v][0] - coordinates[u][0]
        diff_y = coordinates[v][1] - coordinates[u][1]
        dist: float = math.sqrt(diff_x * diff_x + diff_y * diff_y)
        g.setDist(u, v, dist)
        g.setDist(v, u, dist)

Auxiliary Variable to Track Tour Vertices

One major enhancement is recording the optimal tour during enumeration. We introduce another variable, parent[bitstate][v], to track the next vertex u on the shortest path.

ret: float = FLOAT_INF
u_min: int = -1
for u in range(self.g.v_num):
    if (state & (1 << u)) == 0:
        s: float = self._recurse(u, state | 1 << u)
        if s + edges[v][u] < ret:
            ret = s + edges[v][u]
            u_min = u
dp[state][v] = ret
self.parent[state][v] = u_min

After the minimal tour distance is found, the optimal tour is reconstructed with the help of the parent variable.

def _form_tour(self):
    self.tour = [0]
    bit = 0
    v = 0
    for _ in range(self.g.v_num - 1):
        v = self.parent[bit][v]
        self.tour.append(v)
        bit = bit | (1 << v)
    self.tour.append(0)

Note that for each test case, only one tour is given after "output". Our code may form a different tour, but it has the same distance as the one the dataset provides, which can be verified by the following code snippet. See the full code on github.

tsp: TSPSolver = TSPSolver(g)
tsp.solve()

output_dist: float = 0.0
output_tour = list(map(lambda x: int(x) - 1, output.split(' ')))
for v in range(1, len(output_tour)):
    pre_v = output_tour[v - 1]
    curr_v = output_tour[v]
    diff_x = coordinates[pre_v][0] - coordinates[curr_v][0]
    diff_y = coordinates[pre_v][1] - coordinates[curr_v][1]
    dist: float = math.sqrt(diff_x * diff_x + diff_y * diff_y)
    output_dist += dist

passed = abs(tsp.dist - output_dist) < 10e-5
if passed:
    print(f'passed dist={tsp.dist}, tour={tsp.tour}')
else:
    print(f'Min Tour Distance = {output_dist}, Computed Tour Distance = {tsp.dist}, Expected Tour = {output_tour}, Result = {tsp.tour}')

The travelling salesman problem (TSP) is a classic NP-hard algorithmic problem. In this series, we will first solve TSP exactly, by getting AC on the AIZU TSP judge with dynamic programming, and then move on to training a Pointer Network with PyTorch to obtain approximate solutions with deep learning and reinforcement learning technology. The complete episodes are listed as follows:

TSP Problem Review

TSP can be modelled as a graph problem, where both directed and undirected graphs and both completely and partially connected graphs are applicable. The following picture from the Wikipedia TSP page shows an undirected, complete TSP with four vertices: A, B, C, D. TSP requires a tour with minimal total distance, starting from an arbitrarily picked vertex and ending at the same node while covering all vertices exactly once. For example, \(A \rightarrow B \rightarrow C \rightarrow D \rightarrow A\) and \(A \rightarrow C \rightarrow B \rightarrow D \rightarrow A\) are valid tours, and among all tours there is only one minimal distance value (though multiple tours attaining that minimum may exist).

Wikipedia 4 Vertices Example
Despite the different types of graphs, notice that we can always employ an adjacency matrix to represent a graph. The above graph can thus be represented by this matrix

\[ \begin{matrix} & \begin{matrix}A&B&C&D\end{matrix} \\ \begin{matrix}A\\B\\C\\D\end{matrix} & \begin{bmatrix}-&20&42&35\\20&-&30&34\\42&30&-&12\\35&34&12&-\end{bmatrix}\\ \end{matrix} \]
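In code, the same matrix can be written down directly, with the '-' diagonal entries replaced by infinity; a small sketch of ours that also evaluates a tour:

import math

INF = math.inf
#        A    B    C    D
dist = [[INF, 20,  42,  35],   # A
        [20,  INF, 30,  34],   # B
        [42,  30,  INF, 12],   # C
        [35,  34,  12,  INF]]  # D

def tour_distance(tour):
    return sum(dist[tour[i]][tour[i + 1]] for i in range(len(tour) - 1))

# A -> B -> C -> D -> A
print(tour_distance([0, 1, 2, 3, 0]))  # 97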

Of course, a TSP instance typically takes the form of \(n\) coordinates in a plane, corresponding to a complete undirected graph, because in a plane every pair of vertices is connected by one edge, and the edge has the same distance in both directions.

AIZU TSP Online Judge

AIZU has a TSP problem where a directed, incomplete graph with V vertices and E directed edges is given, and the output expects the minimal total distance. The example below has 4 vertices and 6 edges.

This test case has minimal tour distance 16, with the corresponding tour being \(0\rightarrow1\rightarrow3\rightarrow2\rightarrow0\), shown in red edges. However, an AIZU instance may not have a valid result, because not every pair of vertices is guaranteed to be connected. In that case, -1 is required, which can also be interpreted as infinity.

Brute Force Solution

A naive way is to enumerate all possible routes starting from vertex 0 and keep the minimal total distance ever generated. The Python code below illustrates this for a 4-vertex graph.

from itertools import permutations

v = [1, 2, 3]
p = permutations(v)
for t in list(p):
    print([0] + list(t) + [0])

The possible routes are

[0, 1, 2, 3, 0]
[0, 1, 3, 2, 0]
[0, 2, 1, 3, 0]
[0, 2, 3, 1, 0]
[0, 3, 1, 2, 0]
[0, 3, 2, 1, 0]
This approach has a runtime complexity of O(\(n!\)), which won't pass AIZU.

Dynamic Programming

To get AC on AIZU TSP, we need to accelerate beyond the factorial runtime complexity by using bitmask dynamic programming. First, let us map the visited set to a binary value. In the 4-vertex case, the state is "0110" if nodes 2 and 1 have already been visited and we are currently at node 1. Besides the visited set, we need to track the current vertex, so we extend dp from one dimension to two: \(dp[bitstate][v]\). In the example, the state is \(dp["0110"][1]\). The transition formula is given by \[ dp[bitstate][v] = \min ( dp[bitstate \cup \{u\}][u] + dist(v,u) \mid u \notin bitstate ) \]

The resulting time complexity is O(\(n^2 2^n\)), since there are \(2^n \times n\) total states, and each state needs one more loop over vertices. Factorial and exponential growth differ significantly, as the table below shows.

| n | \(n!\) | \(n^2 \cdot 2^n\) |
|------|-------------|---------|
| 8 | 40320 | 16384 |
| 10 | 3628800 | 102400 |
| 12 | 479001600 | 589824 |
| 14 | 87178291200 | 3211264 |
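The table values can be reproduced in a couple of lines (ours):

from math import factorial

for n in (8, 10, 12, 14):
    print(n, factorial(n), n * n * 2 ** n)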

Pause a second and think about why bitmask DP works here. Notice there are lots of redundant sub-calls, one of which is highlighted in the red ellipse below.

In this episode, a straightforward top-down memoization DP version is given in Python 3 and Java 8. A benefit of the top-down DP approach is that we don't need to consider topological ordering when enumerating states. Notice that there is a trick in the Java version: each element of dp is initialized as Integer.MAX_VALUE, so that only one statement is needed to update a new dp value.

```java
res = Math.min(res, s + g.edges[v][u]);
```
However, this simplicity comes at the cost of clarity, and care must be taken when dealing with actual INF (the unreachable case). In the Python version, we could have used the same trick, perhaps by initializing with a large integer value representing INF. But for clarity, we handle the different cases explicitly with if-else statements and mark the initial value as -1 (INT_INF).

```python
INT_INF = -1

if s != INT_INF and edges[v][u] != INT_INF:
    if ret == INT_INF:
        ret = s + edges[v][u]
    else:
        ret = min(ret, s + edges[v][u])
```

Below is the complete AC code in Java 8 and Python 3. It can also be downloaded from GitHub.

AIZU Java 8 Recursive Version

```java
// passed http://judge.u-aizu.ac.jp/onlinejudge/description.jsp?id=DPL_2_A
import java.util.Arrays;
import java.util.Scanner;

public class Main {
    public static class Graph {
        public final int V_NUM;
        public final int[][] edges;

        public Graph(int V_NUM) {
            this.V_NUM = V_NUM;
            this.edges = new int[V_NUM][V_NUM];
            for (int i = 0; i < V_NUM; i++) {
                // unreachable edges are marked with Integer.MAX_VALUE
                Arrays.fill(this.edges[i], Integer.MAX_VALUE);
            }
        }

        public void setDist(int src, int dest, int dist) {
            this.edges[src][dest] = dist;
        }
    }

    public static class TSP {
        public final Graph g;
        long[][] dp;

        public TSP(Graph g) {
            this.g = g;
        }

        public long solve() {
            int N = g.V_NUM;
            dp = new long[1 << N][N];
            for (int i = 0; i < dp.length; i++) {
                Arrays.fill(dp[i], -1); // -1 marks "not computed yet"
            }

            long ret = recurse(0, 0);
            return ret == Integer.MAX_VALUE ? -1 : ret;
        }

        // min distance to visit every vertex not in state from v and return to 0
        private long recurse(int state, int v) {
            int ALL = (1 << g.V_NUM) - 1;
            if (dp[state][v] >= 0) {
                return dp[state][v];
            }
            if (state == ALL && v == 0) {
                dp[state][v] = 0;
                return 0;
            }
            long res = Integer.MAX_VALUE;
            for (int u = 0; u < g.V_NUM; u++) {
                if ((state & (1 << u)) == 0) {
                    long s = recurse(state | 1 << u, u);
                    // long arithmetic keeps unreachable sums >= Integer.MAX_VALUE
                    res = Math.min(res, s + g.edges[v][u]);
                }
            }
            dp[state][v] = res;
            return res;
        }
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        int V = in.nextInt();
        int E = in.nextInt();
        Graph g = new Graph(V);
        while (E > 0) {
            int src = in.nextInt();
            int dest = in.nextInt();
            int dist = in.nextInt();
            g.setDist(src, dest, dist);
            E--;
        }
        System.out.println(new TSP(g).solve());
    }
}
```

AIZU Python 3 Recursive Version

```python
from typing import List

INT_INF = -1  # sentinel representing infinity (unreachable)


class Graph:
    v_num: int
    edges: List[List[int]]

    def __init__(self, v_num: int):
        self.v_num = v_num
        self.edges = [[INT_INF for c in range(v_num)] for r in range(v_num)]

    def setDist(self, src: int, dest: int, dist: int):
        self.edges[src][dest] = dist


class TSPSolver:
    g: Graph
    dp: List[List[int]]

    def __init__(self, g: Graph):
        self.g = g
        self.dp = [[None for c in range(g.v_num)] for r in range(1 << g.v_num)]

    def solve(self) -> int:
        return self._recurse(0, 0)

    def _recurse(self, v: int, state: int) -> int:
        """
        Min distance to visit every vertex not in state from v and return to 0.

        :param v: current vertex
        :param state: bitmask of visited vertices
        :return: -1 means INF (no valid tour)
        """
        dp = self.dp
        edges = self.g.edges

        if dp[state][v] is not None:
            return dp[state][v]

        if (state == (1 << self.g.v_num) - 1) and (v == 0):
            dp[state][v] = 0
            return dp[state][v]

        ret: int = INT_INF
        for u in range(self.g.v_num):
            if (state & (1 << u)) == 0:
                s: int = self._recurse(u, state | 1 << u)
                if s != INT_INF and edges[v][u] != INT_INF:
                    if ret == INT_INF:
                        ret = s + edges[v][u]
                    else:
                        ret = min(ret, s + edges[v][u])
        dp[state][v] = ret
        return ret


def main():
    V, E = map(int, input().split())
    g: Graph = Graph(V)
    for _ in range(E):
        src, dest, dist = map(int, input().split())
        g.setDist(src, dest, dist)

    tsp: TSPSolver = TSPSolver(g)
    print(tsp.solve())


if __name__ == "__main__":
    main()
```
