A Generic Graph-based Neural Architecture Encoding Scheme for Predictor-based NAS

[Abstract]

[1. Introduction]

two key components in a NAS framework
1. architecture searching module
2. architecture evaluation module
  - training every candidate architecture until convergence (thousands of architectures)
  - computational burden is large
  ⇒ directions
  
  1) evaluation: accelerate the evaluation of each architecture (w.r.t. ranking correlation)
  
  2) searching: increase sample efficiency so that fewer architectures need to be evaluated
  - solution: learn an approximate performance predictor → use the predictor to sample potentially good architectures
  - the performance predictor predicts performance from architecture encodings
    - sequence-based scheme
    - graph-based scheme: GCN
      - for operations, it is more natural to encode them as transforms of node attributes (mimicking how the information is processed) than as node attributes themselves
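
A minimal sketch of the predictor-based search direction above: score a large candidate pool with a cheap predictor and spend the expensive evaluation only on the top-ranked few. Everything here is a toy assumption for illustration (`sample_arch`, `expensive_eval`, `MeanOpPredictor`, the 6-edge / 5-op space); it is not the paper's predictor or search strategy.

```python
import random

def sample_arch(rng, n_edges=6, n_ops=5):
    """Toy architecture: one operation index per edge (hypothetical space)."""
    return tuple(rng.randrange(n_ops) for _ in range(n_edges))

def expensive_eval(arch):
    """Stand-in for training to convergence; returns a toy score in [0, 1]."""
    return sum(arch) / (4 * len(arch))

class MeanOpPredictor:
    """Toy predictor: remembers the mean observed score per operation index."""

    def __init__(self, n_ops=5):
        self.n_ops = n_ops
        self.sums = [0.0] * n_ops
        self.counts = [0] * n_ops

    def fit(self, history):
        # Refit from scratch on all (architecture, score) pairs seen so far.
        self.sums = [0.0] * self.n_ops
        self.counts = [0] * self.n_ops
        for arch, score in history:
            for op in arch:
                self.sums[op] += score
                self.counts[op] += 1

    def predict(self, arch):
        vals = [self.sums[op] / self.counts[op] if self.counts[op] else 0.0
                for op in arch]
        return sum(vals) / len(vals)

def predictor_based_search(rounds=3, pool_size=200, top_k=5, seed=0):
    rng = random.Random(seed)
    # Bootstrap with a few randomly sampled, truly evaluated architectures.
    history = [(a, expensive_eval(a))
               for a in (sample_arch(rng) for _ in range(top_k))]
    predictor = MeanOpPredictor()
    for _ in range(rounds):
        predictor.fit(history)
        pool = [sample_arch(rng) for _ in range(pool_size)]
        pool.sort(key=predictor.predict, reverse=True)  # cheap ranking
        history += [(a, expensive_eval(a)) for a in pool[:top_k]]  # costly part
    return history

history = predictor_based_search()
best_arch, best_score = max(history, key=lambda pair: pair[1])
```

With these defaults the predictor scores 600 candidates, but only 20 architectures go through `expensive_eval`, which is the sample-efficiency argument in miniature.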
introduce GATES
- models information as the attributes of input nodes
- models the data processing of each operation as a different transform of that information
⇒ embeddings of isomorphic architectures are the same

[2. Related Work]

[2.1 Architecture Evaluation Module]

parameter sharing: construct supernet → one-shot
- effective, but not accurate and not generally applicable

[2.2 Architecture Searching Module]

[2.3 Neural Architecture Encoders]

[3. Method]

[3.1 Predictor-Based Neural Architecture Search]


[3.2 GATES: A Generic Neural Architecture Encoder]

$\hat{s} = P(a) = \mathrm{MLP}(\mathrm{Enc}(a))$
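
The composition in this formula can be sketched as plain function composition: an encoder maps an architecture to a vector, and an MLP maps that vector to a scalar score. The encoder below is a placeholder one-hot (sequence-style) encoding, not GATES, and the MLP weights are random and untrained; `one_hot_encoder`, `make_mlp`, and `make_predictor` are hypothetical names used only for illustration.

```python
import random

def one_hot_encoder(arch, n_ops=5):
    """Placeholder Enc: concatenated one-hot codes of the per-edge op indices."""
    vec = []
    for op in arch:
        vec.extend(1.0 if k == op else 0.0 for k in range(n_ops))
    return vec

def make_mlp(in_dim, hidden_dim, seed=0):
    """One hidden ReLU layer and a scalar output; weights are random, untrained."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
          for _ in range(hidden_dim)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]

    def mlp(x):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
        return sum(w * h for w, h in zip(w2, hidden))

    return mlp

def make_predictor(enc, mlp):
    """P(a) = MLP(Enc(a))."""
    return lambda arch: mlp(enc(arch))

arch = (0, 3, 1, 4, 2, 2)  # toy architecture: 6 edges, 5 candidate ops
predictor = make_predictor(one_hot_encoder, make_mlp(in_dim=30, hidden_dim=8))
score = predictor(arch)
```

Keeping `enc` as a parameter means a different encoder can be swapped in without touching the MLP head.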


difference between GATES and GCN
- GATES models operations as the processing of the node attributes
- GCN models them as the node attributes themselves
representational power

1) reasonable modeling of DAGs

2) intrinsic proper handling of DAG isomorphism
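
A toy illustration of both points, assuming an operation-on-edge cell: information starts at the input node, and each edge's operation acts as an elementwise gate on the information flowing along it. This is only in the spirit of the idea noted above; the actual GATES encoder uses learned transforms that are omitted here, and `gated_encode`, `slot_encode`, `OP_EMB`, and the op names are hypothetical. Relabeling the two intermediate nodes gives an isomorphic cell: the gated encoding is unchanged, while a naive per-slot sequence encoding differs.

```python
import math
import random

OPS = ["skip", "conv1x1", "conv3x3", "avgpool"]
DIM = 4
_rng = random.Random(0)
# Random (untrained) operation embeddings and input-node information.
OP_EMB = {op: [_rng.gauss(0, 1) for _ in range(DIM)] for op in OPS}
INPUT_INFO = [_rng.gauss(0, 1) for _ in range(DIM)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_encode(n_nodes, edges):
    """edges maps (src, dst) with src < dst to an op name; returns output-node info."""
    info = [[0.0] * DIM for _ in range(n_nodes)]
    info[0] = list(INPUT_INFO)
    for dst in range(1, n_nodes):  # topological order because src < dst
        for (src, d), op in edges.items():
            if d != dst:
                continue
            gate = [sigmoid(e) for e in OP_EMB[op]]
            for k in range(DIM):
                # The operation transforms the information, not the node itself.
                info[dst][k] += gate[k] * info[src][k]
    return info[-1]

def slot_encode(edges, slots):
    """Naive sequence-style encoding: op index per fixed edge slot, -1 if absent."""
    return [OPS.index(edges[s]) if s in edges else -1 for s in slots]

SLOTS = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]
arch_a = {(0, 1): "conv3x3", (0, 2): "skip",
          (1, 3): "avgpool", (2, 3): "conv1x1", (0, 3): "skip"}
# Same cell with intermediate nodes 1 and 2 relabeled (isomorphic to arch_a).
arch_b = {(0, 2): "conv3x3", (0, 1): "skip",
          (2, 3): "avgpool", (1, 3): "conv1x1", (0, 3): "skip"}

emb_a = gated_encode(4, arch_a)
emb_b = gated_encode(4, arch_b)
same_embedding = all(math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)
                     for x, y in zip(emb_a, emb_b))
same_slots = slot_encode(arch_a, SLOTS) == slot_encode(arch_b, SLOTS)
```

The comparison uses a tolerance because the two cells sum the same terms in a different order, which can change the last floating-point digits.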

[3.3 Neural Architecture Search Utilizing the Predictor]

[4. Experiments]

[4.2 Predictor Evaluation on NAS-Bench-201]

15625 architectures in the OOE (operation-on-edge) search space
- the first 50%: training data
- the remaining 50%: test data
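
A sketch of this protocol: split the architecture list at the halfway point, then score a predictor on the test half with a ranking correlation. Kendall's tau is used here as one common ranking correlation; these notes do not name the metric, and they do not say how the odd count is rounded, so rounding down is an assumption. `split_first_half` and `kendall_tau` are illustrative helpers.

```python
def split_first_half(items):
    """First half for training, the rest for testing (cut index rounds down)."""
    cut = len(items) // 2
    return items[:cut], items[cut:]

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

arch_ids = list(range(15625))  # indices standing in for the architectures
train_ids, test_ids = split_first_half(arch_ids)

# Toy check: one swapped pair among four items.
true_acc = [0.1, 0.2, 0.3, 0.4]
pred_acc = [0.1, 0.3, 0.2, 0.4]
tau = kendall_tau(true_acc, pred_acc)
```

The pairwise loop is quadratic, so for the full test half a library implementation would be the practical choice; the explicit version is kept here to show what is being counted.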