Contact aggregation classes

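This article walks through the implementation of the contact-matching algorithm in Android: how the MatchScore class computes a contact's match score, how the NameDistance class measures the similarity between names, and how the ContactMatcher class combines this information to decide whether contacts should be aggregated.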


MatchScore

packages/providers/ContactsProvider/src/com/android/providers/contacts/aggregation/util/MatchScore.java

This class records the match score for each contact; automatic aggregation uses it to select candidates.

public class MatchScore implements Comparable<MatchScore> {
    // Scores are multiplied by this number to allow room for "fractional" scores
    public static final int SCORE_SCALE = 1000; // scale factor applied to the score
    // Best possible match score
    public static final int MAX_SCORE = 100; // maximum score

    private long mRawContactId; // identifiers of the contact being scored
    private long mContactId;
    private long mAccountId;

    private boolean mKeepIn; // forced match
    private boolean mKeepOut; // forced non-match

    private int mPrimaryScore; // primary score: name match score
    private int mSecondaryScore; // secondary score: match score for phone number and other data
    private int mMatchCount;  // incremented on each score update; also feeds into the final match score
    ...
}
Next is the method that shows how these member variables relate:

    public int getScore() {
        if (mKeepOut) {
            return 0; // excluded: return 0 immediately
        }

        if (mKeepIn) {
            return MAX_SCORE; // forced match: return 100 immediately
        }

        int score = (mPrimaryScore > mSecondaryScore ? mPrimaryScore : mSecondaryScore); // take the larger score

        // Ensure that of two contacts with the same match score the one with more matching
        // data elements wins.
        return score * SCORE_SCALE + mMatchCount; // the scale is 1000, so mMatchCount carries very little weight
    }
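
To make the arithmetic in getScore() concrete, here is a minimal standalone sketch. The class name CombinedScoreSketch and the method combinedScore are hypothetical; only the two constants and the formula come from the snippet above.

```java
// Illustrative sketch only: mirrors the arithmetic of getScore() shown above.
// CombinedScoreSketch and combinedScore are hypothetical names.
final class CombinedScoreSketch {
    static final int SCORE_SCALE = 1000;
    static final int MAX_SCORE = 100;

    static int combinedScore(boolean keepOut, boolean keepIn,
                             int primaryScore, int secondaryScore, int matchCount) {
        if (keepOut) {
            return 0;           // excluded candidates always score 0
        }
        if (keepIn) {
            return MAX_SCORE;   // forced matches return the fixed maximum
        }
        // The larger of the two scores dominates; matchCount only breaks ties.
        int score = Math.max(primaryScore, secondaryScore);
        return score * SCORE_SCALE + matchCount;
    }
}
```

With a primary score of 99, a secondary score of 71 and two updates, combinedScore(false, false, 99, 71, 2) gives 99002. Two candidates that both score 71 are separated only by their match counts.
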
The comparison method:

    @Override
    public int compareTo(MatchScore another) {
        return another.getScore() - getScore(); // compare by score; the higher score sorts first
    }
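
Because compareTo subtracts in reverse order, sorting a list of scores puts the highest score first. The sketch below illustrates that ordering with a hypothetical stand-in class, RankedScoreSketch. It uses Integer.compare instead of subtraction, which is a substitution of mine: it yields the same ordering for the non-negative scores involved here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for MatchScore: only an id and a precomputed score.
final class RankedScoreSketch implements Comparable<RankedScoreSketch> {
    final long contactId;
    final int score;

    RankedScoreSketch(long contactId, int score) {
        this.contactId = contactId;
        this.score = score;
    }

    @Override
    public int compareTo(RankedScoreSketch other) {
        // Reversed argument order: the higher score compares as "smaller",
        // so it ends up at the front of a sorted list.
        return Integer.compare(other.score, score);
    }

    // Returns the contact ids ordered from best to worst score.
    static List<Long> rank(List<RankedScoreSketch> candidates) {
        List<RankedScoreSketch> sorted = new ArrayList<>(candidates);
        Collections.sort(sorted);
        List<Long> ids = new ArrayList<>();
        for (RankedScoreSketch s : sorted) {
            ids.add(s.contactId);
        }
        return ids;
    }
}
```
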

NameDistance

packages/providers/ContactsProvider/src/com/android/providers/contacts/aggregation/util/NameDistance.java

This class exposes a single method:

public float getDistance(byte bytes1[], byte bytes2[]) 
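Returns the distance between two names; note that both arguments are byte arrays. As with Jaro–Winkler similarity, a larger value means the names are closer. For the definition of this distance, see the article "A brief introduction to the Jaro–Winkler distance matching algorithm".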
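
For readers who do not have the referenced article at hand, here is a minimal sketch of the textbook Jaro–Winkler similarity over byte arrays. It is not the NameDistance source, and the real class may differ in its details; the class name JaroWinklerSketch, the prefix limit of 4 and the scaling factor 0.1 are the usual textbook choices, assumed here.

```java
// Textbook Jaro-Winkler similarity on byte arrays. Illustrative only;
// not copied from NameDistance, whose details may differ.
final class JaroWinklerSketch {

    static float jaro(byte[] a, byte[] b) {
        if (a.length == 0 && b.length == 0) return 1.0f;
        if (a.length == 0 || b.length == 0) return 0.0f;

        // Two bytes "match" if equal and no further apart than this window.
        int window = Math.max(0, Math.max(a.length, b.length) / 2 - 1);
        boolean[] aMatched = new boolean[a.length];
        boolean[] bMatched = new boolean[b.length];
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            int from = Math.max(0, i - window);
            int to = Math.min(b.length - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (!bMatched[j] && a[i] == b[j]) {
                    aMatched[i] = true;
                    bMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0f;

        // Count matched bytes that appear in a different order (half-transpositions).
        int halfTranspositions = 0;
        int k = 0;
        for (int i = 0; i < a.length; i++) {
            if (!aMatched[i]) continue;
            while (!bMatched[k]) k++;
            if (a[i] != b[k]) halfTranspositions++;
            k++;
        }
        float m = matches;
        return (m / a.length + m / b.length + (m - halfTranspositions / 2.0f) / m) / 3.0f;
    }

    static float jaroWinkler(byte[] a, byte[] b) {
        float j = jaro(a, b);
        // Boost the score for a shared prefix of up to 4 bytes.
        int limit = Math.min(4, Math.min(a.length, b.length));
        int prefix = 0;
        while (prefix < limit && a[prefix] == b[prefix]) prefix++;
        return j + prefix * 0.1f * (1.0f - j);
    }
}
```

For the classic pair "MARTHA" and "MARHTA", the Jaro value is about 0.944 and the Jaro–Winkler value about 0.961; identical inputs give 1.0.
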

ContactMatcher

packages/providers/ContactsProvider/src/com/android/providers/contacts/aggregation/util/ContactMatcher.java

Computes contact match scores.

Constants

    // Suggest to aggregate contacts if their match score is equal or greater than this threshold
    public static final int SCORE_THRESHOLD_SUGGEST = 50; 

    // Automatically aggregate contacts if their match score is equal or greater than this threshold
    public static final int SCORE_THRESHOLD_PRIMARY = 70;

    // Automatically aggregate contacts if the match score is equal or greater than this threshold
    // and there is a secondary match (phone number, email etc).
    public static final int SCORE_THRESHOLD_SECONDARY = 50; 
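These three constants are thresholds on how well contacts must match; the lower the value, the weaker the match it accepts. There are other constants as well: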

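    private static final int NO_DATA_SCORE = -1; // score when there is nothing to match on
    private static final int PHONE_MATCH_SCORE = 71; // score for a phone number match
    private static final int EMAIL_MATCH_SCORE = 71; // score for an email match
    private static final int NICKNAME_MATCH_SCORE = 71; // score for a nickname match
    private static final int MAX_MATCHED_NAME_LENGTH = 30; // maximum name length considered when matching
Matching-algorithm constants, from strictest to loosest: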

    public static final int MATCHING_ALGORITHM_EXACT = 0; // exact match
    public static final int MATCHING_ALGORITHM_CONSERVATIVE = 1; // conservative match
    public static final int MATCHING_ALGORITHM_APPROXIMATE = 2; // approximate match
Finally:

    public static final float APPROXIMATE_MATCH_THRESHOLD = 0.82f; // name distance threshold
    public static final float APPROXIMATE_MATCH_THRESHOLD_FOR_EMAIL = 0.95f; // email address distance threshold
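
Returning to the three SCORE_THRESHOLD_* constants, one way to read their comments is as the decision rule sketched below. This is only an interpretation of those comments, not the aggregator's actual control flow; the class ThresholdSketch, the Decision enum and the decide method are hypothetical.

```java
// One reading of the SCORE_THRESHOLD_* comments as a decision rule.
// ThresholdSketch, Decision and decide are hypothetical names.
final class ThresholdSketch {
    static final int SCORE_THRESHOLD_SUGGEST = 50;
    static final int SCORE_THRESHOLD_PRIMARY = 70;
    static final int SCORE_THRESHOLD_SECONDARY = 50;

    enum Decision { AGGREGATE, SUGGEST, NONE }

    static Decision decide(int score, boolean hasSecondaryMatch) {
        if (score >= SCORE_THRESHOLD_PRIMARY) {
            return Decision.AGGREGATE;              // strong enough on its own
        }
        if (score >= SCORE_THRESHOLD_SECONDARY && hasSecondaryMatch) {
            return Decision.AGGREGATE;              // weaker score backed by phone, email, etc.
        }
        if (score >= SCORE_THRESHOLD_SUGGEST) {
            return Decision.SUGGEST;                // offer as a suggestion only
        }
        return Decision.NONE;
    }
}
```

Under this reading, a score of 60 aggregates automatically only when a phone number or email also matches; otherwise it is merely suggested.
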

Members

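Two score matrices define the range of match scores between different name types. For example, when an exact name matches another exact name, the score is 99; when an exact name instead matches another contact's nickname, the score is 50. There are five name types; see the NAME_LOOKUP section of "ContactsProvider table analysis".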
    private static int[] sMinScore =
            new int[NameLookupType.TYPE_COUNT * NameLookupType.TYPE_COUNT]; // minimum scores
    private static int[] sMaxScore =
            new int[NameLookupType.TYPE_COUNT * NameLookupType.TYPE_COUNT]; // maximum scores
They are initialized in a static block:

    static {
        setScoreRange(NameLookupType.NAME_EXACT,
                NameLookupType.NAME_EXACT, 99, 99);
        setScoreRange(NameLookupType.NAME_VARIANT,
                NameLookupType.NAME_VARIANT, 90, 90);
        setScoreRange(NameLookupType.NAME_COLLATION_KEY,
                NameLookupType.NAME_COLLATION_KEY, 50, 80);

        ...

        setScoreRange(NameLookupType.NICKNAME,
                NameLookupType.NICKNAME, 50, 60);
        setScoreRange(NameLookupType.NICKNAME,
                NameLookupType.NAME_COLLATION_KEY, 50, 60);
        setScoreRange(NameLookupType.NICKNAME,
                NameLookupType.EMAIL_BASED_NICKNAME, 50, 60);
    }
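The setScoreRange method computes an index from the two types and stores the values, using a one-dimensional array as if it were two-dimensional: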

    private static void setScoreRange(int candidateNameType, int nameType, int scoreFrom, int scoreTo) {
        int index = nameType * NameLookupType.TYPE_COUNT + candidateNameType;
        sMinScore[index] = scoreFrom;
        sMaxScore[index] = scoreTo;
    }
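
The flattened-matrix indexing can be tried in isolation with the sketch below. TYPE_COUNT is set to 5 purely as a placeholder, and the class ScoreRangeSketch is hypothetical; the index formula is the one shown above. Note that the matrix is not symmetric: a range registered for (candidateNameType, nameType) is not visible with the arguments swapped.

```java
// Flattened 2-D lookup table, following the index formula of setScoreRange.
// ScoreRangeSketch is a hypothetical name and TYPE_COUNT = 5 is a placeholder value.
final class ScoreRangeSketch {
    static final int TYPE_COUNT = 5;

    private final int[] minScore = new int[TYPE_COUNT * TYPE_COUNT];
    private final int[] maxScore = new int[TYPE_COUNT * TYPE_COUNT];

    // Row = nameType, column = candidateNameType.
    private static int index(int candidateNameType, int nameType) {
        return nameType * TYPE_COUNT + candidateNameType;
    }

    void setScoreRange(int candidateNameType, int nameType, int scoreFrom, int scoreTo) {
        int i = index(candidateNameType, nameType);
        minScore[i] = scoreFrom;
        maxScore[i] = scoreTo;
    }

    int getMinScore(int candidateNameType, int nameType) {
        return minScore[index(candidateNameType, nameType)];
    }

    int getMaxScore(int candidateNameType, int nameType) {
        // Pairs that were never registered keep the default 0.
        return maxScore[index(candidateNameType, nameType)];
    }
}
```
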
The collections that hold MatchScore objects:

    private final HashMap<Long, MatchScore> mScores = new HashMap<Long, MatchScore>();  // looks up a MatchScore by contact id
    private final ArrayList<MatchScore> mScoreList = new ArrayList<MatchScore>(); // list of MatchScore objects
    private int mScoreCount = 0; // number of scores in use; may be smaller than mScoreList.size()
The getMatchingScore method shows how they relate:

    private MatchScore getMatchingScore(long contactId) {
        MatchScore matchingScore = mScores.get(contactId); // look in the map first
        if (matchingScore == null) {  // not found: obtain a MatchScore for this contact
            if (mScoreList.size() > mScoreCount) {  // the list holds more objects than are in use: reuse one and reset it
                matchingScore = mScoreList.get(mScoreCount);
                matchingScore.reset(contactId);
            } else {
                matchingScore = new MatchScore(contactId); // create a new object
                mScoreList.add(matchingScore);
            }
            mScoreCount++; // one more MatchScore is now in use
            mScores.put(contactId, matchingScore); // map the contact id to its MatchScore
        }
        return matchingScore;
    }
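
The pattern in getMatchingScore is a small object pool: the list keeps allocated objects alive, while the map and the counter track which of them are currently in use. The sketch below reproduces that pattern with hypothetical names (ScorePoolSketch, Entry). The clear() method is an assumption about how such a pool would be reset between runs; it is not shown in the original text.

```java
import java.util.ArrayList;
import java.util.HashMap;

// Object-pool pattern modelled on getMatchingScore. All names are hypothetical.
final class ScorePoolSketch {
    static final class Entry {
        long contactId;
        int primaryScore;

        Entry(long contactId) {
            reset(contactId);
        }

        void reset(long contactId) {
            this.contactId = contactId;
            this.primaryScore = 0;
        }
    }

    private final HashMap<Long, Entry> byId = new HashMap<>();
    private final ArrayList<Entry> pool = new ArrayList<>();
    private int inUse = 0;

    Entry get(long contactId) {
        Entry e = byId.get(contactId);
        if (e == null) {
            if (pool.size() > inUse) {
                e = pool.get(inUse);        // reuse a previously allocated object
                e.reset(contactId);
            } else {
                e = new Entry(contactId);   // pool exhausted: allocate
                pool.add(e);
            }
            inUse++;
            byId.put(contactId, e);
        }
        return e;
    }

    // Assumed reset step: forget the mapping but keep the allocated objects.
    void clear() {
        byId.clear();
        inUse = 0;
    }

    int pooledObjects() { return pool.size(); }
    int activeCount() { return inUse; }
}
```

After clear(), the next get() hands back the first pooled object instead of allocating, which is why the in-use count can be smaller than the list size.
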

Methods

    public void updateScoreWithPhoneNumberMatch(long contactId) {
        updateSecondaryScore(contactId, PHONE_MATCH_SCORE);
    }

    public void updateScoreWithEmailMatch(long contactId) {
        updateSecondaryScore(contactId, EMAIL_MATCH_SCORE);
    }

    public void updateScoreWithNicknameMatch(long contactId) {
        updateSecondaryScore(contactId, NICKNAME_MATCH_SCORE);
    }

    private void updatePrimaryScore(long contactId, int score) {
        getMatchingScore(contactId).updatePrimaryScore(score);
    }

    private void updateSecondaryScore(long contactId, int score) {
        getMatchingScore(contactId).updateSecondaryScore(score);
    }

    public void keepIn(long contactId) {
        getMatchingScore(contactId).keepIn();
    }

    public void keepOut(long contactId) {
        getMatchingScore(contactId).keepOut();
    }
These are the methods that update a MatchScore.

    public void matchName(long contactId, int candidateNameType, String candidateName,
            int nameType, String name, int algorithm) {
        int maxScore = getMaxScore(candidateNameType, nameType);
        if (maxScore == 0) { // the score matrix gives 0 for this type pair, so there is nothing to match
            return;
        }

        if (candidateName.equals(name)) { // the names are identical
            updatePrimaryScore(contactId, maxScore);
            return;
        }

        if (algorithm == MATCHING_ALGORITHM_EXACT) { // exact matching requested: stop here
            return;
        }

        int minScore = getMinScore(candidateNameType, nameType);
        if (minScore == maxScore) { // the range is a single value, so approximate matching cannot apply
            return;
        }

        final byte[] decodedCandidateName;
        final byte[] decodedName;
        try {
            decodedCandidateName = Hex.decodeHex(candidateName); // convert to bytes for the distance computation
            decodedName = Hex.decodeHex(name);
        } catch (RuntimeException e) {
            // How could this happen??  See bug 6827136
            Log.e(TAG, "Failed to decode normalized name.  Skipping.", e);
            return;
        }

        NameDistance nameDistance = algorithm == MATCHING_ALGORITHM_CONSERVATIVE ?
                mNameDistanceConservative : mNameDistanceApproximate;

        int score;
        float distance = nameDistance.getDistance(decodedCandidateName, decodedName); // compute the name distance
        boolean emailBased = candidateNameType == NameLookupType.EMAIL_BASED_NICKNAME
                || nameType == NameLookupType.EMAIL_BASED_NICKNAME;
        float threshold = emailBased
                ? APPROXIMATE_MATCH_THRESHOLD_FOR_EMAIL
                : APPROXIMATE_MATCH_THRESHOLD;
        if (distance > threshold) {
            score = (int)(minScore +  (maxScore - minScore) * (1.0f - distance)); // compute the score
        } else {
            score = 0;
        }

        updatePrimaryScore(contactId, score); // update the primary score
    }
matchName is the method that computes the name match score.
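
The final scoring step of matchName can be isolated as a pure function, which makes the numbers easy to check. The sketch below transcribes the threshold test and the formula exactly as they appear in the snippet; the class ApproximateScoreSketch and the method approximateScore are hypothetical.

```java
// Transcription of the last step of matchName as a pure function.
// ApproximateScoreSketch and approximateScore are hypothetical names.
final class ApproximateScoreSketch {
    static final float APPROXIMATE_MATCH_THRESHOLD = 0.82f;
    static final float APPROXIMATE_MATCH_THRESHOLD_FOR_EMAIL = 0.95f;

    static int approximateScore(int minScore, int maxScore, float distance, boolean emailBased) {
        float threshold = emailBased
                ? APPROXIMATE_MATCH_THRESHOLD_FOR_EMAIL
                : APPROXIMATE_MATCH_THRESHOLD;
        if (distance > threshold) {
            // Same expression as in the snippet; the cast truncates toward zero.
            return (int) (minScore + (maxScore - minScore) * (1.0f - distance));
        }
        return 0; // below the threshold: no primary score from this name pair
    }
}
```

For a score range of 50 to 80 and a distance of 0.875, the result is 50 + 30 * 0.125 = 53.75, truncated to 53. The same distance yields 0 when either side is an email-based nickname, because the stricter 0.95 threshold applies.
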

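public List<Long> prepareSecondaryMatchCandidates(int threshold)
Returns the ids of the contacts whose secondary score meets the threshold.

public List<MatchScore> pickBestMatches(int threshold) 
Returns the MatchScore objects that meet the threshold.

public long pickBestMatch(int threshold, boolean allowMultipleMatches) 
Returns the id of the best-matching contact, that is, the one with the highest score; if several contacts share the top score, the return value depends on allowMultipleMatches.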

RawContactMatcher

packages/providers/ContactsProvider/src/com/android/providers/contacts/aggregation/util/RawContactMatcher.java

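This class is essentially the same as ContactMatcher; the main change is that most method parameters now also include an account id. It is used only by ContactAggregator2, whereas ContactMatcher is used by ContactAggregator. Newer versions aggregate with ContactAggregator2. ContactsProvider2 creates a different object depending on the aggregation algorithm version number: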

private void initForDefaultLocale() {
    ...
        PROPERTY_AGGREGATION_ALGORITHM_VERSION = (value == 0)
                ? AGGREGATION_ALGORITHM_OLD_VERSION
                : AGGREGATION_ALGORITHM_NEW_VERSION;
        mContactAggregator = (value == 0)
                ? new ContactAggregator(this, mContactsHelper,
                        createPhotoPriorityResolver(context), mNameSplitter, mCommonNicknameCache)
                : new ContactAggregator2(this, mContactsHelper,
                        createPhotoPriorityResolver(context), mNameSplitter, mCommonNicknameCache);
    ...
}
The version number is currently the new one, so ContactAggregator2 is used.

RawContactMatchingCandidates

packages/providers/ContactsProvider/src/com/android/providers/contacts/aggregation/util/RawContactMatchingCandidates.java

This class simply holds data about the matched contacts; it is used by ContactAggregator2.

    private List<MatchScore> mBestMatches; // the MatchScore objects
    private Set<Long> mRawContactIds = null; // all raw contact ids
    private Map<Long, Long> mRawContactToContact = null; // raw contact id -> contact id
    private Map<Long, Long> mRawContactToAccount = null; // raw contact id -> account id
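
The null initial values suggest that the lookup structures are derived from mBestMatches on demand. The sketch below shows one way such a holder could work; it is an assumption about the design, not the class's actual code, and MatchingCandidatesSketch and Candidate are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical holder that derives lookup maps lazily from a list of matches.
final class MatchingCandidatesSketch {
    // Stand-in for MatchScore: only the three ids are needed here.
    static final class Candidate {
        final long rawContactId;
        final long contactId;
        final long accountId;

        Candidate(long rawContactId, long contactId, long accountId) {
            this.rawContactId = rawContactId;
            this.contactId = contactId;
            this.accountId = accountId;
        }
    }

    private final List<Candidate> bestMatches;
    private Map<Long, Long> rawContactToContact = null;
    private Map<Long, Long> rawContactToAccount = null;

    MatchingCandidatesSketch(List<Candidate> bestMatches) {
        this.bestMatches = bestMatches;
    }

    // Builds both maps in one pass the first time either is requested.
    private void ensureMaps() {
        if (rawContactToContact != null) return;
        rawContactToContact = new HashMap<>();
        rawContactToAccount = new HashMap<>();
        for (Candidate c : bestMatches) {
            rawContactToContact.put(c.rawContactId, c.contactId);
            rawContactToAccount.put(c.rawContactId, c.accountId);
        }
    }

    Set<Long> getRawContactIds() {
        ensureMaps();
        return rawContactToContact.keySet();
    }

    long getContactId(long rawContactId) {
        ensureMaps();
        return rawContactToContact.get(rawContactId);
    }

    long getAccountId(long rawContactId) {
        ensureMaps();
        return rawContactToAccount.get(rawContactId);
    }
}
```
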

CommonNicknameCache

packages/providers/ContactsProvider/src/com/android/providers/contacts/aggregation/util/CommonNicknameCache.java
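A wrapper around the NICKNAME_LOOKUP table; see the NICKNAME_LOOKUP table in "ContactsProvider table analysis". For example, "Robert", "Bob" and "Rob" belong to the same CLUSTER, meaning they are the same name written in different ways (Western languages use phonetic scripts, so one name can have several spellings; Chinese does not have this problem). Note that this is unrelated to the rows in the data table whose MIME type is Nickname.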
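
The cluster idea can be illustrated with an in-memory map: each name maps to the set of clusters it belongs to, and two names are treated as equivalent when those sets overlap. The sketch below is a simplification under that assumption. The real class reads the NICKNAME_LOOKUP table, and NicknameClusterSketch, addCluster and shareCluster are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// In-memory illustration of nickname clusters. All names are hypothetical;
// the real class is backed by the NICKNAME_LOOKUP table.
final class NicknameClusterSketch {
    private final Map<String, Set<String>> clustersByName = new HashMap<>();

    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Registers every given name as a member of the cluster.
    void addCluster(String clusterId, String... names) {
        for (String name : names) {
            clustersByName
                    .computeIfAbsent(normalize(name), k -> new HashSet<>())
                    .add(clusterId);
        }
    }

    // Two names are equivalent if they have at least one cluster in common.
    boolean shareCluster(String a, String b) {
        Set<String> ca = clustersByName.get(normalize(a));
        Set<String> cb = clustersByName.get(normalize(b));
        if (ca == null || cb == null) {
            return false;
        }
        return !Collections.disjoint(ca, cb);
    }
}
```

With a cluster containing "Robert", "Bob" and "Rob", shareCluster("Bob", "Robert") is true, while a name outside every cluster never matches.
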