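Count to everything: a simple object-counting method with plenty of room for improvement

This post walks through a novel approach that uses the annotated object regions as convolution kernels to extract image features, and then uses convolutions to regress the number of objects in the image. This lets the method make effective use of the objects' shape information.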


1. Code walkthrough

Overall architecture

1. Backbone: Resnet50FPN

A ResNet-50 network extracts the features feat; only two levels, map3 and map4, are used.
from collections import OrderedDict

import torch.nn as nn
import torchvision


class Resnet50FPN(nn.Module):
    def __init__(self):
        super(Resnet50FPN, self).__init__()
        self.resnet = torchvision.models.resnet50(pretrained=True)
        children = list(self.resnet.children())
        self.conv1 = nn.Sequential(*children[:4])  # stem: conv1, bn1, relu, maxpool
        self.conv2 = children[4]  # layer1
        self.conv3 = children[5]  # layer2, produces map3 (stride 8)
        self.conv4 = children[6]  # layer3, produces map4 (stride 16)

    def forward(self, im_data):
        feat = OrderedDict()
        feat_map = self.conv1(im_data)
        feat_map = self.conv2(feat_map)
        feat_map3 = self.conv3(feat_map)
        feat_map4 = self.conv4(feat_map3)
        # Only the two deeper levels are returned
        feat['map3'] = feat_map3
        feat['map4'] = feat_map4
        return feat
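
To make the two feature levels concrete, here is a small pure-Python sketch of the shape arithmetic, not part of the original code. It assumes the standard torchvision ResNet-50 layout, in which the stem convolution, the max-pool and the first block of layer2 and layer3 each halve the resolution (rounding up). The helper names halve and backbone_shapes are hypothetical.

```python
def halve(n: int) -> int:
    """Output length of one stride-2 stage in ResNet-50: ceil(n / 2)."""
    return (n + 1) // 2


def backbone_shapes(height: int, width: int) -> dict:
    """Return (channels, H, W) of map3 and map4 for a single input image."""
    h, w = height, width
    for _ in range(3):  # stem conv, max-pool, first block of layer2
        h, w = halve(h), halve(w)
    map3 = (512, h, w)                 # output of layer2, stride 8
    map4 = (1024, halve(h), halve(w))  # output of layer3, stride 16
    return {"map3": map3, "map4": map4}


shapes = backbone_shapes(384, 512)
print(shapes)  # {'map3': (512, 48, 64), 'map4': (1024, 24, 32)}
```

So map4 has half the spatial resolution of map3, which is why results computed on the two levels have to be resized before they can be concatenated.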

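2. Cleverly using "convolution kernels" for feature extraction

1. Resize the image and scale the three exemplar boxes by the same factor
        # Body of the resize transform's __call__(self, sample); self.max_hw caps the longer image side
        image,lines_boxes,density = sample['image'], sample['lines_boxes'],sample['gt_density']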
        
        W, H = image.size
        if W > self.max_hw or H > self.max_hw:
            scale_factor = float(self.max_hw)/ max(H, W)
            new_H = 8*int(H*scale_factor/8)
            new_W = 8*int(W*scale_factor/8)
            resized_image = transforms.Resize((new_H, new_W))(image)
            resized_density = cv2.resize(density, (new_W, new_H))
            orig_count = np.sum(density)
            new_count = np.sum(resized_density)

            if new_count > 0: resized_density = resized_density * (orig_count / new_count)
            
        else:
            scale_factor = 1
            resized_image = image
            resized_density = density
        boxes = list()
        for box in lines_boxes:
            box2 = [int(k*scale_factor) for k in box]
            y1, x1, y2, x2 = box2[0], box2[1], box2[2], box2[3]
            boxes.append([0, y1,x1,y2,x2])

        boxes = torch.Tensor(boxes).unsqueeze(0)
        resized_image = Normalize(resized_image)
        resized_density = torch.from_numpy(resized_density).unsqueeze(0).unsqueeze(0)
        sample = {'image':resized_image,'boxes':boxes,'gt_density':resized_density}
        return sample
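
The resize arithmetic above can be checked by hand with a pure-Python sketch. The function names plan_resize and renormalize are hypothetical, and the value max_hw=1504 is only an illustrative choice; the logic is meant to mirror the transform above, not replace it.

```python
def plan_resize(width, height, boxes, max_hw):
    """Return (new_width, new_height, scaled_boxes) following the transform above."""
    if width > max_hw or height > max_hw:
        scale = float(max_hw) / max(height, width)
        new_h = 8 * int(height * scale / 8)  # round down to a multiple of 8
        new_w = 8 * int(width * scale / 8)
    else:
        scale, new_h, new_w = 1, height, width
    # Each box becomes [0, y1, x1, y2, x2], with the leading 0 as in the source
    scaled = [[0] + [int(k * scale) for k in box] for box in boxes]
    return new_w, new_h, scaled


def renormalize(density, target_sum):
    """Rescale a density map (list of rows) so that it sums to target_sum."""
    total = sum(sum(row) for row in density)
    if total <= 0:
        return density
    return [[v * target_sum / total for v in row] for row in density]


plan = plan_resize(3008, 2000, [[100, 200, 300, 401]], max_hw=1504)
print(plan)  # (1504, 1000, [[0, 50, 100, 150, 200]])

fixed = renormalize([[0.5, 0.5], [1.0, 0.0]], target_sum=3.0)
print(sum(sum(row) for row in fixed))
```

Rounding both sides down to a multiple of 8 keeps the image size compatible with the stride-8 feature map, and the renormalization step restores the object count that interpolation would otherwise change.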
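2. Use the three exemplar boxes as "convolution kernels" for feature extraction
import math

import torch
import torch.nn.functional as F


def extract_features(feature_model, image, boxes,feat_map_keys=['map3','map4'], exemplar_scales=[0.9, 1.1]):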

    N, M = image.shape[0], boxes.shape[2]
    """
    Getting features for the image N * C * H * W
    """
    # Feature maps extracted by the ResNet backbone
    Image_features = feature_model(image)
    """
    Getting features for the examples (N*M) * C * h * w
    """
    for ix in range(0,N):
        # boxes = boxes.squeeze(0)
        # Use a separate name so that `boxes` is not overwritten when N > 1
        boxes_ix = boxes[ix][0]
        cnter = 0
        Cnter1 = 0
        for keys in feat_map_keys:
            image_features = Image_features[keys][ix].unsqueeze(0)
            if keys == 'map1' or keys == 'map2':
                Scaling = 4.0
            elif keys == 'map3':
                Scaling = 8.0
            elif keys == 'map4':
                Scaling =  16.0
            else:
                Scaling = 32.0
            # Map the boxes onto the feature grid: divide by the stride, then round outward to integers
            boxes_scaled = boxes_ix / Scaling
            boxes_scaled[:, 1:3] = torch.floor(boxes_scaled[:, 1:3])
            boxes_scaled[:, 3:5] = torch.ceil(boxes_scaled[:, 3:5])
            # +1 so that the end indices can be used as exclusive slice bounds
            boxes_scaled[:, 3:5] = boxes_scaled[:, 3:5] + 1 # make the end indices exclusive
            # Height and width of the feature map
            feat_h, feat_w = image_features.shape[-2], image_features.shape[-1]
            # make sure exemplars don't go out of bound
            # Clamp the boxes: start indices >= 0, end indices at most feat_h / feat_w
            boxes_scaled[:, 1:3] = torch.clamp_min(boxes_scaled[:, 1:3], 0)
            boxes_scaled[:, 3] = torch.clamp_max(boxes_scaled[:, 3], feat_h)
            boxes_scaled[:, 4] = torch.clamp_max(boxes_scaled[:, 4], feat_w)            
            box_hs = boxes_scaled[:, 3] - boxes_scaled[:, 1]
            box_ws = boxes_scaled[:, 4] - boxes_scaled[:, 2]            
            max_h = math.ceil(max(box_hs))
            max_w = math.ceil(max(box_ws))
            # M = 3
            for j in range(0,M):
                y1, x1 = int(boxes_scaled[j,1]), int(boxes_scaled[j,2])  
                y2, x2 = int(boxes_scaled[j,3]), int(boxes_scaled[j,4]) 
                #print(y1,y2,x1,x2,max_h,max_w)
                if j == 0:
                    # Crop the "convolution kernel" that corresponds to this box
                    examples_features = image_features[:,:,y1:y2, x1:x2]
                    # If this kernel is smaller than the largest one, resize it so that all three kernels have the same size
                    if examples_features.shape[2] != max_h or examples_features.shape[3] != max_w:
                        #examples_features = pad_to_size(examples_features, max_h, max_w)
                        examples_features = F.interpolate(examples_features, size=(max_h,max_w),mode='bilinear')                    
                else:
                    feat = image_features[:,:,y1:y2, x1:x2]
                    if feat.shape[2] != max_h or feat.shape[3] != max_w:
                        feat = F.interpolate(feat, size=(max_h,max_w),mode='bilinear')
                        #feat = pad_to_size(feat, max_h, max_w)
                    # Stack the three kernels along the first (output-channel) dimension
                    examples_features = torch.cat((examples_features,feat),dim=0)
            """
            Convolving example features over image features
            """
            # Convolve the image features with the three kernels; the padding keeps the output size equal to the input size
            h, w = examples_features.shape[2], examples_features.shape[3]
            features =    F.conv2d(
                    F.pad(image_features, ((int(w/2)), int((w-1)/2), int(h/2), int((h-1)/2))),
                    examples_features
                )
            combined = features.permute([1,0,2,3])
            # computing features for scales 0.9 and 1.1 
            for scale in exemplar_scales:
                    h1 = math.ceil(h * scale)
                    w1 = math.ceil(w * scale)
                    if h1 < 1: # use original size if scaled size is too small
                        h1 = h
                    if w1 < 1:
                        w1 = w
                    # Rescale the kernels by 0.9 and 1.1 and convolve again
                    examples_features_scaled = F.interpolate(examples_features, size=(h1,w1),mode='bilinear')  
                    features_scaled =    F.conv2d(F.pad(image_features, ((int(w1/2)), int((w1-1)/2), int(h1/2), int((h1-1)/2))),
                    examples_features_scaled)
                    features_scaled = features_scaled.permute([1,0,2,3])
                    # Concatenate the responses of all kernel scales along the channel dimension
                    combined = torch.cat((combined,features_scaled),dim=1)
            if cnter == 0:
                # First feature level: start the accumulator
                Combined = 1.0 * combined
            else:
                if Combined.shape[2] != combined.shape[2] or Combined.shape[3] != combined.shape[3]:
                    # map3 and map4 have different resolutions, so resize the map4 result to the map3 size before concatenating
                    combined = F.interpolate(combined, size=(Combined.shape[2],Combined.shape[3]),mode='bilinear')
                Combined = torch.cat((Combined,combined),dim=1)
            cnter += 1
        # Stack the results of the N images in the batch along a new leading dimension
        if ix == 0:
            All_feat = 1.0 * Combined.unsqueeze(0)
        else:
            All_feat = torch.cat((All_feat,Combined.unsqueeze(0)),dim=0)

    return All_feat
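
The core idea of the function above, cropping an exemplar patch out of the feature map and sliding it over the same map, can be reproduced on a toy example. The following is a single-channel, pure-Python sketch under simplifying assumptions (one channel, one exemplar, no rescaling); the helper names crop and correlate_same are hypothetical. The padding split follows the F.pad call above: h//2 and w//2 cells before, (h-1)//2 and (w-1)//2 cells after.

```python
def crop(fmap, y1, x1, y2, x2):
    """Cut the exemplar patch fmap[y1:y2, x1:x2] out of a 2-D feature map."""
    return [row[x1:x2] for row in fmap[y1:y2]]


def correlate_same(fmap, kernel):
    """Cross-correlate fmap with kernel, zero-padded so the output keeps fmap's size."""
    H, W = len(fmap), len(fmap[0])
    h, w = len(kernel), len(kernel[0])
    top, bottom = h // 2, (h - 1) // 2
    left, right = w // 2, (w - 1) // 2
    padded = [[0.0] * (W + left + right) for _ in range(H + top + bottom)]
    for y in range(H):
        for x in range(W):
            padded[y + top][x + left] = fmap[y][x]
    return [
        [
            sum(padded[y + dy][x + dx] * kernel[dy][dx]
                for dy in range(h) for dx in range(w))
            for x in range(W)
        ]
        for y in range(H)
    ]


# Toy feature map containing the same 2x2 pattern twice
fmap = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0, 0],
    [0, 3, 4, 0, 1, 2],
    [0, 0, 0, 0, 3, 4],
    [0, 0, 0, 0, 0, 0],
]
kernel = crop(fmap, 1, 1, 3, 3)  # the "annotated box" around the first object
resp_map = correlate_same(fmap, kernel)
best = max(max(row) for row in resp_map)
peaks = [(y, x) for y, row in enumerate(resp_map) for x, v in enumerate(row) if v == best]
print(peaks)  # [(2, 2), (3, 5)]
```

The response map has one peak per occurrence of the pattern, which is the signal the later regressor turns into a count. With an even-sized kernel the peak sits on the bottom-right cell of each match because of the asymmetric padding.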

3. Predict the count with convolutions

import torch
import torch.nn as nn


class CountRegressor(nn.Module):
    def __init__(self, input_channels,pool='mean'):
        super(CountRegressor, self).__init__()
        self.pool = pool
        self.regressor = nn.Sequential(
            nn.Conv2d(input_channels, 196, 7, padding=3),
            nn.ReLU(),
            nn.UpsamplingBilinear2d(scale_factor=2),
            nn.Conv2d(196, 128, 5, padding=2),
            nn.ReLU(),
            nn.UpsamplingBilinear2d(scale_factor=2),
            nn.Conv2d(128, 64, 3, padding=1),
            nn.ReLU(),
            nn.UpsamplingBilinear2d(scale_factor=2),
            nn.Conv2d(64, 32, 1),
            nn.ReLU(),
            nn.Conv2d(32, 1, 1),
            nn.ReLU(),
        )

    def forward(self, im):
        num_sample =  im.shape[0]
        if num_sample == 1:
            output = self.regressor(im.squeeze(0))
            if self.pool == 'mean':
                output = torch.mean(output, dim=(0),keepdim=True)  
                return output
            elif self.pool == 'max':
                output, _ = torch.max(output, 0,keepdim=True)
                return output
        else:
            for i in range(0,num_sample):
                output = self.regressor(im[i])
                if self.pool == 'mean':
                    output = torch.mean(output, dim=(0),keepdim=True)
                elif self.pool == 'max':
                    output, _ = torch.max(output, 0,keepdim=True)
                if i == 0:
                    Output = output
                else:
                    Output = torch.cat((Output,output),dim=0)
            return Output
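
A little bookkeeping helps to see what goes into and comes out of this regressor. The sketch below is pure Python with hypothetical helper names. It assumes the settings shown earlier (two feature levels, kernel scales 0.9 and 1.1, three exemplars) and assumes that the final count is read off by summing the predicted map, which is the usual convention for density maps but is not shown in the code above.

```python
def feature_channels(num_maps=2, num_scales=2):
    """Channels per exemplar: one response map per (feature level, kernel size)."""
    return num_maps * (1 + num_scales)


def upsampled_size(h, w, num_upsamples=3):
    """Spatial size after the 2x upsampling layers of the regressor."""
    factor = 2 ** num_upsamples
    return h * factor, w * factor


def pool_mean(maps):
    """Average M per-exemplar maps element-wise, as pool='mean' does over dim 0."""
    m = len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(d[r][c] for d in maps) / m for c in range(cols)] for r in range(rows)]


def count_from_density(density):
    """Assumed read-out: the count is the sum of the density map."""
    return sum(sum(row) for row in density)


print(feature_channels())      # 6, the input_channels the regressor would need
print(upsampled_size(48, 64))  # (384, 512), back to the input resolution

per_exemplar = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.0, 1.0], [1.0, 0.0]],
]
pooled = pool_mean(per_exemplar)
count = count_from_density(pooled)
print(count)  # 2.0
```

Treating the three exemplars as a batch and averaging their outputs means each exemplar votes with its own map, and the pooled map is what gets compared with gt_density.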

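2. Summary

1. The annotated boxes are cleverly reused as convolution kernels for feature extraction.
2. Convolutions are then used to regress the total count (via a predicted density map).
3. There are still many shortcomings, though: if the fish (or, more generally, the objects) are oriented differently, the prediction may have a large error.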
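
The orientation problem in point 3 can be illustrated with a toy computation. This is a pure-Python sketch with hypothetical names, using a raw 3x3 patch instead of real backbone features, so it only shows the mechanism, not the actual size of the error.

```python
def rotate90(patch):
    """Rotate a 2-D patch by 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def response(patch, kernel):
    """Correlation response at one location: element-wise product, summed."""
    return sum(p * k for pr, kr in zip(patch, kernel) for p, k in zip(pr, kr))


# A horizontal bar stands in for an elongated object such as a fish
exemplar = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
same = response(exemplar, exemplar)              # object with the exemplar's orientation
turned = response(rotate90(exemplar), exemplar)  # same object rotated by 90 degrees
print(same, turned)  # 3 1
```

The rotated object produces only a third of the response of the aligned one, so a kernel taken from one orientation can under-respond to instances in another.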