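0. Key concepts

Concept: Function
Topic: A logical concept used to categorize Messages. A single Topic can be spread across multiple Brokers.
Partition: The basis of horizontal scaling and of all parallelism in Kafka. Every Topic is split into at least one Partition.
Offset: The sequence number of a message within a Partition. Ordering holds only within a Partition, not across Partitions.
Consumer: Fetches (consumes) Messages from a Broker.
Producer: Sends (produces) Messages to a Broker.
Replication: Kafka replicates Messages at Partition granularity. Every Partition has a replication factor of at least 1 (with a factor of 1, the only copy is the Partition itself).
Leader: The replica set of each Partition elects exactly one Leader, which handles all read and write requests. The other Replicas pull updates from the Leader and apply them locally.
Broker: Brokers accept requests from Producers and Consumers and persist Messages to local disk. Each Cluster elects one Broker as the Controller, which handles Partition Leader election, coordinates Partition reassignment, and so on.
ISR: In-Sync Replicas, the subset of Replicas that are currently alive and able to catch up with the Leader. Because reads and writes hit the Leader first, a Replica that pulls data from the Leader usually lags behind it somewhat. Older releases measured this lag along two dimensions, elapsed time and number of messages, and exceeding either threshold removed the Replica from the ISR; the code analyzed below only uses the time threshold (config.replicaLagTimeMaxMs). Every Leader Partition has its own independent ISR.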

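1. Why analyze the Kafka source code

  To understand Kafka's internals in depth

  To get better at using Scala in practice

2. Server startup

The startup flow is shown below (I originally planned to draw a sequence diagram, but a mind map conveys the flow better, so I used a mind map):

2.1 The entry point, Kafka.scala

As the mind map above shows, Kafka starts from the main() function in Kafka.scala: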

  def main(args: Array[String]): Unit = {
    try {
      val serverProps = getPropsFromArgs(args)
      val kafkaServerStartable = KafkaServerStartable.fromProps(serverProps)

      // attach shutdown handler to catch control-c
      Runtime.getRuntime().addShutdownHook(new Thread() {
        override def run() = {
          kafkaServerStartable.shutdown
        }
      })

      kafkaServerStartable.startup
      kafkaServerStartable.awaitShutdown
    }
    catch {
      case e: Throwable =>
        fatal(e)
        System.exit(1)
    }
    System.exit(0)
  }

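  The code above does the following:

    reads the server startup parameters from the configuration file with getPropsFromArgs();

    creates a KafkaServerStartable object;

    registers a shutdown hook that calls shutdown() on the KafkaServerStartable object;

    calls startup() on the KafkaServerStartable object;

    calls awaitShutdown() on the KafkaServerStartable object, which blocks until the server has shut down.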
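
The lifecycle listed above can be sketched without any Kafka dependency. This is only a minimal sketch, not the real KafkaServerStartable: DemoServer and DemoMain are hypothetical names, and a CountDownLatch is assumed as the mechanism behind awaitShutdown() (the KafkaServer code later in this article does create a shutdownLatch).

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for KafkaServerStartable; not the real class.
final class DemoServer {
  private val shutdownLatch = new CountDownLatch(1)
  private val started = new AtomicBoolean(false)

  def startup(): Unit = started.set(true)

  // Idempotent: only the first call after startup() releases the latch.
  def shutdown(): Unit =
    if (started.compareAndSet(true, false)) shutdownLatch.countDown()

  // Blocks the calling thread until shutdown() has run.
  def awaitShutdown(): Unit = shutdownLatch.await()

  def isRunning: Boolean = started.get
}

object DemoMain {
  // Same order as Kafka.main(): register the hook, start, then block.
  def serve(server: DemoServer): Unit = {
    // The hook runs on Ctrl-C or SIGTERM and triggers a clean shutdown.
    Runtime.getRuntime.addShutdownHook(new Thread() {
      override def run(): Unit = server.shutdown()
    })
    server.startup()
    server.awaitShutdown()
  }
}
```

Calling DemoMain.serve(new DemoServer) blocks the main thread until the JVM receives a termination signal, which is the same shape as the main() function shown earlier.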

2.2 KafkaServerStartable, the wrapper around KafkaServer

  private val server = new KafkaServer(serverConfig)

  def startup() {
    try {
      server.startup()
    }
    catch {
      case e: Throwable =>
        fatal("Fatal error during KafkaServerStartable startup. Prepare to shutdown", e)
        // KafkaServer already calls shutdown() internally, so this is purely for logging & the exit code
        System.exit(1)
    }
  }

2.3 KafkaServer, the class that does the actual startup

The startup code in KafkaServer is clearly layered and, together with its comments, is easy to follow:

  /**
   * Start up API for bringing up a single instance of the Kafka server.
   * Instantiates the LogManager, the SocketServer and the request handlers - KafkaRequestHandlers
   */
  def startup() {
    try {
      info("starting")

      if(isShuttingDown.get)
        throw new IllegalStateException("Kafka server is still shutting down, cannot re-start!")

      if(startupComplete.get)
        return

      val canStartup = isStartingUp.compareAndSet(false, true)
      if (canStartup) {
        metrics = new Metrics(metricConfig, reporters, kafkaMetricsTime, true)

        brokerState.newState(Starting)

        /* start scheduler */
        kafkaScheduler.startup()

        /* setup zookeeper */
        zkUtils = initZk()

        /* start log manager */
        logManager = createLogManager(zkUtils.zkClient, brokerState)
        logManager.startup()

        /* generate brokerId */
        config.brokerId =  getBrokerId
        this.logIdent = "[Kafka Server " + config.brokerId + "], "

        socketServer = new SocketServer(config, metrics, kafkaMetricsTime)
        socketServer.startup()

        /* start replica manager */
        replicaManager = new ReplicaManager(config, metrics, time, kafkaMetricsTime, zkUtils, kafkaScheduler, logManager,
          isShuttingDown)
        replicaManager.startup()

        /* start kafka controller */
        kafkaController = new KafkaController(config, zkUtils, brokerState, kafkaMetricsTime, metrics, threadNamePrefix)
        kafkaController.startup()

        /* start kafka coordinator */
        consumerCoordinator = GroupCoordinator.create(config, zkUtils, replicaManager)
        consumerCoordinator.startup()

        /* Get the authorizer and initialize it if one is specified.*/
        authorizer = Option(config.authorizerClassName).filter(_.nonEmpty).map { authorizerClassName =>
          val authZ = CoreUtils.createObject[Authorizer](authorizerClassName)
          authZ.configure(config.originals())
          authZ
        }

        /* start processing requests */
        apis = new KafkaApis(socketServer.requestChannel, replicaManager, consumerCoordinator,
          kafkaController, zkUtils, config.brokerId, config, metadataCache, metrics, authorizer)
        requestHandlerPool = new KafkaRequestHandlerPool(config.brokerId, socketServer.requestChannel, apis, config.numIoThreads)
        brokerState.newState(RunningAsBroker)

        Mx4jLoader.maybeLoad()

        /* start dynamic config manager */
        dynamicConfigHandlers = Map[String, ConfigHandler](ConfigType.Topic -> new TopicConfigHandler(logManager),
                                                           ConfigType.Client -> new ClientIdConfigHandler(apis.quotaManagers))

        // Apply all existing client configs to the ClientIdConfigHandler to bootstrap the overrides
        // TODO: Move this logic to DynamicConfigManager
        AdminUtils.fetchAllEntityConfigs(zkUtils, ConfigType.Client).foreach {
          case (clientId, properties) => dynamicConfigHandlers(ConfigType.Client).processConfigChanges(clientId, properties)
        }

        // Create the config manager. start listening to notifications
        dynamicConfigManager = new DynamicConfigManager(zkUtils, dynamicConfigHandlers)
        dynamicConfigManager.startup()

        /* tell everyone we are alive */
        val listeners = config.advertisedListeners.map {case(protocol, endpoint) =>
          if (endpoint.port == 0)
            (protocol, EndPoint(endpoint.host, socketServer.boundPort(protocol), endpoint.protocolType))
          else
            (protocol, endpoint)
        }
        kafkaHealthcheck = new KafkaHealthcheck(config.brokerId, listeners, zkUtils)
        kafkaHealthcheck.startup()

        /* register broker metrics */
        registerStats()

        shutdownLatch = new CountDownLatch(1)
        startupComplete.set(true)
        isStartingUp.set(false)
        AppInfoParser.registerAppInfo(jmxPrefix, config.brokerId.toString)
        info("started")
      }
    }
    catch {
      case e: Throwable =>
        fatal("Fatal error during KafkaServer startup. Prepare to shutdown", e)
        isStartingUp.set(false)
        shutdown()
        throw e
    }
  }

2.3.1 KafkaScheduler

KafkaScheduler is a scheduler built on java.util.concurrent.ScheduledThreadPoolExecutor. The actual work is done by a pool of threads whose names start with the prefix kafka-scheduler-, followed by xx.

Here xx is the thread's sequence number.

/**
 * A scheduler based on java.util.concurrent.ScheduledThreadPoolExecutor
 *
 * It has a pool of kafka-scheduler- threads that do the actual work.
 *
 * @param threads The number of threads in the thread pool
 * @param threadNamePrefix The name to use for scheduler threads. This prefix will have a number appended to it.
 * @param daemon If true the scheduler threads will be "daemon" threads and will not block jvm shutdown.
 */
@threadsafe
class KafkaScheduler(val threads: Int,
                     val threadNamePrefix: String = "kafka-scheduler-",
                     daemon: Boolean = true) extends Scheduler with Logging {
  private var executor: ScheduledThreadPoolExecutor = null
  private val schedulerThreadId = new AtomicInteger(0)

  override def startup() {
    debug("Initializing task scheduler.")
    this synchronized {
      if(isStarted)
        throw new IllegalStateException("This scheduler has already been started!")
      executor = new ScheduledThreadPoolExecutor(threads)
      executor.setContinueExistingPeriodicTasksAfterShutdownPolicy(false)
      executor.setExecuteExistingDelayedTasksAfterShutdownPolicy(false)
      executor.setThreadFactory(new ThreadFactory() {
                                  def newThread(runnable: Runnable): Thread =
                                    Utils.newThread(threadNamePrefix + schedulerThreadId.getAndIncrement(), runnable, daemon)
                                })
    }
  }
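
To see the thread naming in action, here is a minimal standalone sketch of the same idea using only the JDK. SchedulerSketch, newExecutor and firstWorkerName are hypothetical names for illustration; the real KafkaScheduler creates its threads through Utils.newThread, which this sketch replaces with a plain Thread.

```scala
import java.util.concurrent.{CountDownLatch, ScheduledThreadPoolExecutor, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}

object SchedulerSketch {
  // Builds an executor whose threads are named prefix + sequence number.
  def newExecutor(threads: Int, prefix: String, daemon: Boolean): ScheduledThreadPoolExecutor = {
    val nextId = new AtomicInteger(0)
    val executor = new ScheduledThreadPoolExecutor(threads)
    // Pending periodic and delayed tasks are dropped once shutdown is requested.
    executor.setContinueExistingPeriodicTasksAfterShutdownPolicy(false)
    executor.setExecuteExistingDelayedTasksAfterShutdownPolicy(false)
    executor.setThreadFactory(new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, prefix + nextId.getAndIncrement())
        t.setDaemon(daemon)
        t
      }
    })
    executor
  }

  // Schedules one periodic task and returns the name of the thread that ran it first.
  def firstWorkerName(): String = {
    val executor = newExecutor(1, "kafka-scheduler-", daemon = true)
    val ran = new CountDownLatch(1)
    val name = new AtomicReference[String]("")
    executor.scheduleAtFixedRate(new Runnable {
      def run(): Unit = {
        name.compareAndSet("", Thread.currentThread.getName)
        ran.countDown()
      }
    }, 0L, 50L, TimeUnit.MILLISECONDS)
    ran.await(5, TimeUnit.SECONDS)
    executor.shutdownNow()
    name.get
  }
}
```

With a single-thread pool, the first worker created by the factory gets sequence number 0, so the task runs on a thread named kafka-scheduler-0.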

2.3.2 ZooKeeper initialization

ZooKeeper initialization does two things:

    val zkUtils = ZkUtils(config.zkConnect,
                          config.zkSessionTimeoutMs,
                          config.zkConnectionTimeoutMs,
                          secureAclsEnabled)
    zkUtils.setupCommonPaths()

First, it connects to the ZooKeeper server; second, it creates the common znodes.

The common znodes are:

  // These are persistent ZK paths that should exist on kafka broker startup.
  val persistentZkPaths = Seq(ConsumersPath,
                              BrokerIdsPath,
                              BrokerTopicsPath,
                              EntityConfigChangesPath,
                              getEntityConfigRootPath(ConfigType.Topic),
                              getEntityConfigRootPath(ConfigType.Client),
                              DeleteTopicsPath,
                              BrokerSequenceIdPath,
                              IsrChangeNotificationPath)

2.3.3 LogManager

  LogManager is the Kafka subsystem responsible for creating, retrieving, and cleaning up logs. All reads and writes are delegated to the individual log instances. Its startup() schedules three periodic tasks on the scheduler (log retention cleanup, flushing dirty logs, and checkpointing recovery points) and starts the log cleaner if it is enabled.

  /**
   *  Start the background threads to flush logs and do log cleanup
   */
  def startup() {
    /* Schedule the cleanup task to delete old logs */
    if(scheduler != null) {
      info("Starting log cleanup with a period of %d ms.".format(retentionCheckMs))
      scheduler.schedule("kafka-log-retention",
                         cleanupLogs,
                         delay = InitialTaskDelayMs,
                         period = retentionCheckMs,
                         TimeUnit.MILLISECONDS)
      info("Starting log flusher with a default period of %d ms.".format(flushCheckMs))
      scheduler.schedule("kafka-log-flusher",
                         flushDirtyLogs,
                         delay = InitialTaskDelayMs,
                         period = flushCheckMs,
                         TimeUnit.MILLISECONDS)
      scheduler.schedule("kafka-recovery-point-checkpoint",
                         checkpointRecoveryPointOffsets,
                         delay = InitialTaskDelayMs,
                         period = flushCheckpointMs,
                         TimeUnit.MILLISECONDS)
    }
    if(cleanerConfig.enableCleaner)
      cleaner.startup()
  }

2.3.4 SocketServer

SocketServer is an NIO socket server. Its threading model is: 1 Acceptor thread that handles new connections; N Processor threads per Acceptor, each with its own selector, that read requests from the sockets; and M Handler threads that process the requests and hand the responses back to the Processor threads for writing.

  /**
   * Start the socket server
   */
  def startup() {
    this.synchronized {

      connectionQuotas = new ConnectionQuotas(maxConnectionsPerIp, maxConnectionsPerIpOverrides)

      val sendBufferSize = config.socketSendBufferBytes
      val recvBufferSize = config.socketReceiveBufferBytes
      val maxRequestSize = config.socketRequestMaxBytes
      val connectionsMaxIdleMs = config.connectionsMaxIdleMs
      val brokerId = config.brokerId

      var processorBeginIndex = 0
      endpoints.values.foreach { endpoint =>
        val protocol = endpoint.protocolType
        val processorEndIndex = processorBeginIndex + numProcessorThreads

        for (i <- processorBeginIndex until processorEndIndex) {
          processors(i) = new Processor(i,
            time,
            maxRequestSize,
            requestChannel,
            connectionQuotas,
            connectionsMaxIdleMs,
            protocol,
            config.values,
            metrics
          )
        }

        val acceptor = new Acceptor(endpoint, sendBufferSize, recvBufferSize, brokerId,
          processors.slice(processorBeginIndex, processorEndIndex), connectionQuotas)
        acceptors.put(endpoint, acceptor)
        Utils.newThread("kafka-socket-acceptor-%s-%d".format(protocol.toString, endpoint.port), acceptor, false).start()
        acceptor.awaitStartup()

        processorBeginIndex = processorEndIndex
      }
    }

    newGauge("NetworkProcessorAvgIdlePercent",
      new Gauge[Double] {
        def value = allMetricNames.map( metricName =>
          metrics.metrics().get(metricName).value()).sum / totalProcessorThreads
      }
    )

    info("Started " + acceptors.size + " acceptor threads")
  }
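
The index bookkeeping in startup() gives every endpoint its own contiguous slice of the shared processors array. A minimal sketch of just that arithmetic follows; ProcessorLayout and assign are hypothetical names, and endpoints are reduced to plain strings for illustration.

```scala
object ProcessorLayout {
  // Returns, per endpoint, the half-open index range [begin, end) of its processors.
  def assign(endpoints: Seq[String], numProcessorThreads: Int): Map[String, Range] = {
    var begin = 0
    endpoints.map { endpoint =>
      val end = begin + numProcessorThreads
      val slice = begin until end
      begin = end // the next endpoint starts where this one stopped
      endpoint -> slice
    }.toMap
  }
}
```

For example, ProcessorLayout.assign(Seq("PLAINTEXT", "SSL"), 3) assigns processors 0 to 2 to the first endpoint and 3 to 5 to the second, so each Acceptor only hands connections to its own processors.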

2.3.5 ReplicaManager

Its startup() schedules two periodic tasks: ISR expiration (maybeShrinkIsr) and propagation of ISR changes (maybePropagateIsrChanges).

  def startup() {
    // start ISR expiration thread
    scheduler.schedule("isr-expiration", maybeShrinkIsr, period = config.replicaLagTimeMaxMs, unit = TimeUnit.MILLISECONDS)
    scheduler.schedule("isr-change-propagation", maybePropagateIsrChanges, period = 2500L, unit = TimeUnit.MILLISECONDS)
  }
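
The idea behind the ISR expiration task can be illustrated with a small model. This is a sketch under simplifying assumptions, not the logic of maybeShrinkIsr itself: Follower, outOfSync and shrink are hypothetical names, and a follower is judged only by the time since it was last caught up with the leader.

```scala
object IsrSketch {
  // Hypothetical view of a follower: its broker id and when it was last caught up.
  final case class Follower(brokerId: Int, lastCaughtUpMs: Long)

  // Followers that have lagged for longer than maxLagMs are considered out of sync.
  def outOfSync(followers: Seq[Follower], nowMs: Long, maxLagMs: Long): Set[Int] =
    followers.filter(f => nowMs - f.lastCaughtUpMs > maxLagMs).map(_.brokerId).toSet

  // The new ISR is the old ISR minus every out-of-sync follower.
  def shrink(isr: Set[Int], followers: Seq[Follower], nowMs: Long, maxLagMs: Long): Set[Int] =
    isr -- outOfSync(followers, nowMs, maxLagMs)
}
```

With an ISR of brokers 1, 2 and 3 (1 being the leader), a follower on broker 3 that was last caught up 9 seconds ago is dropped when the threshold is 5 seconds, leaving brokers 1 and 2.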

2.3.6 KafkaController

startup() is invoked when the controller module of a Kafka server starts. It does not assume that the current broker is the controller; it only registers the session expiration listener and starts the controller leader election.

  def startup() = {
    inLock(controllerContext.controllerLock) {
      info("Controller starting up")
      registerSessionExpirationListener()
      isRunning = true
      controllerElector.startup
      info("Controller startup complete")
    }
  }

Registering the session expiration listener:

  private def registerSessionExpirationListener() = {
    zkUtils.zkClient.subscribeStateChanges(new SessionExpirationListener())
  }

The ZkClient method it calls (Java):

  public void subscribeStateChanges(final IZkStateListener listener) {
      synchronized (_stateListener) {
          _stateListener.add(listener);
      }
  }

The listener itself (excerpt):

  class SessionExpirationListener() extends IZkStateListener with Logging {
    this.logIdent = "[SessionExpirationListener on " + config.brokerId + "], "
    @throws(classOf[Exception])
    def handleStateChanged(state: KeeperState) {
      // do nothing, since zkclient will do reconnect for us.
    }
    // ... (rest of the listener omitted in this excerpt)
  }

 

The election process:

  def startup {
    inLock(controllerContext.controllerLock) {
      controllerContext.zkUtils.zkClient.subscribeDataChanges(electionPath, leaderChangeListener)
      elect
    }
  }

  def elect: Boolean = {
    val timestamp = SystemTime.milliseconds.toString
    val electString = Json.encode(Map("version" -> 1, "brokerid" -> brokerId, "timestamp" -> timestamp))

    leaderId = getControllerID
    /*
     * We can get here during the initial startup and the handleDeleted ZK callback. Because of the potential race condition,
     * it's possible that the controller has already been elected when we get here. This check will prevent the following
     * createEphemeralPath method from getting into an infinite loop if this broker is already the controller.
     */
    if(leaderId != -1) {
       debug("Broker %d has been elected as leader, so stopping the election process.".format(leaderId))
       return amILeader
    }

    try {
      val zkCheckedEphemeral = new ZKCheckedEphemeral(electionPath,
                                                      electString,
                                                      controllerContext.zkUtils.zkConnection.getZookeeper,
                                                      JaasUtils.isZkSecurityEnabled())
      zkCheckedEphemeral.create()
      info(brokerId + " successfully elected as leader")
      leaderId = brokerId
      onBecomingLeader()
    } catch {
      case e: ZkNodeExistsException =>
        // If someone else has written the path, then
        leaderId = getControllerID

        if (leaderId != -1)
          debug("Broker %d was elected as leader instead of broker %d".format(leaderId, brokerId))
        else
          warn("A leader has been elected but just resigned, this will result in another round of election")

      case e2: Throwable =>
        error("Error while electing or becoming leader on broker %d".format(brokerId), e2)
        resign()
    }
    amILeader
  }

  def amILeader : Boolean = leaderId == brokerId
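
The essence of elect is "whoever creates the ephemeral node first wins". The following sketch models that rule in memory, with an AtomicReference standing in for the znode at the election path. ElectionSketch and ControllerNode are hypothetical names, and no ZooKeeper client is involved, so watches, sessions and error handling are all left out.

```scala
import java.util.concurrent.atomic.AtomicReference

object ElectionSketch {
  // Stands in for the ephemeral znode: None means no controller is registered.
  final class ControllerNode {
    private val owner = new AtomicReference[Option[Int]](None)

    // Mirrors creating the ephemeral node: succeeds only if it does not exist yet.
    def tryCreate(brokerId: Int): Boolean = owner.compareAndSet(None, Some(brokerId))

    // Mirrors getControllerID: -1 means nobody holds the node.
    def current: Int = owner.get.getOrElse(-1)

    // Mirrors session expiry removing the ephemeral node.
    def delete(): Unit = owner.set(None)
  }

  // Returns true if brokerId is the controller after this attempt.
  def elect(node: ControllerNode, brokerId: Int): Boolean = {
    // Only try to create the node when no controller is registered.
    if (node.current == -1) node.tryCreate(brokerId)
    node.current == brokerId
  }
}
```

The first broker to call elect wins; later callers see the existing owner and return false until the node is deleted, at which point a new round can succeed.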

2.3.7 GroupCoordinator

GroupCoordinator handles group membership and offset management. Each Kafka server instantiates one coordinator, which is responsible for a set of groups. Groups are assigned to coordinators based on their group names.

  def startup() {
    info("Starting up.")
    heartbeatPurgatory = new DelayedOperationPurgatory[DelayedHeartbeat]("Heartbeat", brokerId)
    joinPurgatory = new DelayedOperationPurgatory[DelayedJoin]("Rebalance", brokerId)
    isActive.set(true)
    info("Startup complete.")
  }

Note: if both a group lock and the metadata lock are needed, always acquire the group lock first and the metadata lock second to prevent deadlock.
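
A fixed acquisition order is the standard way to rule out this kind of deadlock: if every code path takes the locks in the same order, no two threads can each hold one lock while waiting for the other. A minimal sketch, where LockOrderSketch, groupLock and metadataLock are hypothetical stand-ins rather than the coordinator's actual fields:

```scala
import java.util.concurrent.locks.ReentrantLock

object LockOrderSketch {
  private val groupLock = new ReentrantLock()    // hypothetical per-group lock
  private val metadataLock = new ReentrantLock() // hypothetical metadata lock

  // Runs body while holding lock, releasing it even if body throws.
  private def withLock[T](lock: ReentrantLock)(body: => T): T = {
    lock.lock()
    try body finally lock.unlock()
  }

  // Every caller goes through here, so the order is always group first, metadata second.
  def updateBoth[T](body: => T): T =
    withLock(groupLock) { withLock(metadataLock) { body } }
}
```

Funnelling all two-lock operations through a single helper such as updateBoth keeps the ordering rule in one place instead of relying on every call site to remember it.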

2.3.8 KafkaApis, the request handling interface

  /**
   * Top-level method that handles all requests and multiplexes to the right api
   */
  def handle(request: RequestChannel.Request) {
    try{
      trace("Handling request:%s from connection %s;securityProtocol:%s,principal:%s".
        format(request.requestObj, request.connectionId, request.securityProtocol, request.session.principal))
      request.requestId match {
        case RequestKeys.ProduceKey => handleProducerRequest(request)
        case RequestKeys.FetchKey => handleFetchRequest(request)
        case RequestKeys.OffsetsKey => handleOffsetRequest(request)
        case RequestKeys.MetadataKey => handleTopicMetadataRequest(request)
        case RequestKeys.LeaderAndIsrKey => handleLeaderAndIsrRequest(request)
        case RequestKeys.StopReplicaKey => handleStopReplicaRequest(request)
        case RequestKeys.UpdateMetadataKey => handleUpdateMetadataRequest(request)
        case RequestKeys.ControlledShutdownKey => handleControlledShutdownRequest(request)
        case RequestKeys.OffsetCommitKey => handleOffsetCommitRequest(request)
        case RequestKeys.OffsetFetchKey => handleOffsetFetchRequest(request)
        case RequestKeys.GroupCoordinatorKey => handleGroupCoordinatorRequest(request)
        case RequestKeys.JoinGroupKey => handleJoinGroupRequest(request)
        case RequestKeys.HeartbeatKey => handleHeartbeatRequest(request)
        case RequestKeys.LeaveGroupKey => handleLeaveGroupRequest(request)
        case RequestKeys.SyncGroupKey => handleSyncGroupRequest(request)
        case RequestKeys.DescribeGroupsKey => handleDescribeGroupRequest(request)
        case RequestKeys.ListGroupsKey => handleListGroupsRequest(request)
        case requestId => throw new KafkaException("Unknown api code " + requestId)
      }
    } catch {
      case e: Throwable =>
        if ( request.requestObj != null)
          request.requestObj.handleError(e, requestChannel, request)
        else {
          val response = request.body.getErrorResponse(request.header.apiVersion, e)
          val respHeader = new ResponseHeader(request.header.correlationId)

          /* If request doesn't have a default error response, we just close the connection.
             For example, when produce request has acks set to 0 */
          if (response == null)
            requestChannel.closeConnection(request.processor, request)
          else
            requestChannel.sendResponse(new Response(request, new ResponseSend(request.connectionId, respHeader, response)))
        }
        error("error when handling request %s".format(request.requestObj), e)
    } finally
      request.apiLocalCompleteTimeMs = SystemTime.milliseconds
  }

Take the handling of a producer request as an example:

  /**
   * Handle a produce request
   */
  def handleProducerRequest(request: RequestChannel.Request) {
    val produceRequest = request.requestObj.asInstanceOf[ProducerRequest]
    val numBytesAppended = produceRequest.sizeInBytes

    val (authorizedRequestInfo, unauthorizedRequestInfo) =  produceRequest.data.partition  {
      case (topicAndPartition, _) => authorize(request.session, Write, new Resource(Topic, topicAndPartition.topic))
    }

    // the callback for sending a produce response
    def sendResponseCallback(responseStatus: Map[TopicAndPartition, ProducerResponseStatus]) {

      val mergedResponseStatus = responseStatus ++ unauthorizedRequestInfo.mapValues(_ => ProducerResponseStatus(ErrorMapping.TopicAuthorizationCode, -1))

      var errorInResponse = false

      mergedResponseStatus.foreach { case (topicAndPartition, status) =>
        if (status.error != ErrorMapping.NoError) {
          errorInResponse = true
          debug("Produce request with correlation id %d from client %s on partition %s failed due to %s".format(
            produceRequest.correlationId,
            produceRequest.clientId,
            topicAndPartition,
            ErrorMapping.exceptionNameFor(status.error)))
        }
      }

      def produceResponseCallback(delayTimeMs: Int) {

        if (produceRequest.requiredAcks == 0) {
          // no operation needed if producer request.required.acks = 0; however, if there is any error in handling
          // the request, since no response is expected by the producer, the server will close socket server so that
          // the producer client will know that some error has happened and will refresh its metadata
          if (errorInResponse) {
            val exceptionsSummary = mergedResponseStatus.map { case (topicAndPartition, status) =>
              topicAndPartition -> ErrorMapping.exceptionNameFor(status.error)
            }.mkString(", ")
            info(
              s"Closing connection due to error during produce request with correlation id ${produceRequest.correlationId} " +
                s"from client id ${produceRequest.clientId} with ack=0\n" +
                s"Topic and partition to exceptions: $exceptionsSummary"
            )
            requestChannel.closeConnection(request.processor, request)
          } else {
            requestChannel.noOperation(request.processor, request)
          }
        } else {
          val response = ProducerResponse(produceRequest.correlationId,
                                          mergedResponseStatus,
                                          produceRequest.versionId,
                                          delayTimeMs)
          requestChannel.sendResponse(new RequestChannel.Response(request,
                                                                  new RequestOrResponseSend(request.connectionId,
                                                                                            response)))
        }
      }

      // When this callback is triggered, the remote API call has completed
      request.apiRemoteCompleteTimeMs = SystemTime.milliseconds

      quotaManagers(RequestKeys.ProduceKey).recordAndMaybeThrottle(produceRequest.clientId,
                                                                   numBytesAppended,
                                                                   produceResponseCallback)
    }

    if (authorizedRequestInfo.isEmpty)
      sendResponseCallback(Map.empty)
    else {
      val internalTopicsAllowed = produceRequest.clientId == AdminUtils.AdminClientId

      // call the replica manager to append messages to the replicas
      replicaManager.appendMessages(
        produceRequest.ackTimeoutMs.toLong,
        produceRequest.requiredAcks,
        internalTopicsAllowed,
        authorizedRequestInfo,
        sendResponseCallback)

      // if the request is put into the purgatory, it will have a held reference
      // and hence cannot be garbage collected; hence we clear its data here in
      // order to let GC re-claim its memory since it is already appended to log
      produceRequest.emptyData()
    }
  }

The requiredAcks branch above corresponds to the producer's acks setting, which the Kafka documentation describes as follows:

The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. The following settings are common:
acks=0 If set to zero then the producer will not wait for any acknowledgment from the server at all. The record will be immediately added to the socket buffer and considered sent. No guarantee can be made that the server has received the record in this case, and the retries configuration will not take effect (as the client won't generally know of any failures). The offset given back for each record will always be set to -1.
acks=1 This will mean the leader will write the record to its local log but will respond without awaiting full acknowledgement from all followers. In this case should the leader fail immediately after acknowledging the record but before the followers have replicated it then the record will be lost.
acks=all This means the leader will wait for the full set of in-sync replicas to acknowledge the record. This guarantees that the record will not be lost as long as at least one in-sync replica remains alive. This is the strongest available guarantee.
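
On the client side this is a single configuration entry. The sketch below only builds the java.util.Properties that a producer would be constructed from and does not depend on the Kafka client library. ProducerAcksSketch and producerProps are hypothetical names, localhost:9092 is a placeholder address, and restricting the value to the three settings quoted above is a choice made for this example.

```scala
import java.util.Properties

object ProducerAcksSketch {
  // Builds producer settings for one of the three acks values described above.
  def producerProps(acks: String): Properties = {
    require(Set("0", "1", "all").contains(acks), s"unsupported acks value: $acks")
    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092") // placeholder broker address
    props.setProperty("acks", acks)
    props
  }
}
```

Choosing "all" trades latency for durability, while "0" maps to the requiredAcks == 0 branch in handleProducerRequest, where the broker sends no response at all.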

2.3.9 DynamicConfigManager

ZooKeeper serves as the dynamic configuration center: the manager watches the children of the config-changes path and processes each change notification in order.

  /**
   * Begin watching for config changes
   */
  def startup() {
    zkUtils.makeSurePersistentPathExists(ZkUtils.EntityConfigChangesPath)
    zkUtils.zkClient.subscribeChildChanges(ZkUtils.EntityConfigChangesPath, ConfigChangeListener)
    processAllConfigChanges()
  }

  /**
   * Process all config changes
   */
  private def processAllConfigChanges() {
    val configChanges = zkUtils.zkClient.getChildren(ZkUtils.EntityConfigChangesPath)
    import JavaConversions._
    processConfigChanges((configChanges: mutable.Buffer[String]).sorted)
  }

  /**
   * Process the given list of config changes
   */
  private def processConfigChanges(notifications: Seq[String]) {
    if (notifications.size > 0) {
      info("Processing config change notification(s)...")
      val now = time.milliseconds
      for (notification <- notifications) {
        val changeId = changeNumber(notification)

        if (changeId > lastExecutedChange) {
          val changeZnode = ZkUtils.EntityConfigChangesPath + "/" + notification

          val (jsonOpt, stat) = zkUtils.readDataMaybeNull(changeZnode)
          processNotification(jsonOpt)
        }
        lastExecutedChange = changeId
      }
      purgeObsoleteNotifications(now, notifications)
    }
  }
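
The ordering logic in processConfigChanges can be tried out in isolation. In this sketch the notification name format (a config_change_ prefix followed by a zero-padded sequence number) is an assumption made for illustration, and ConfigChangeSketch and pending are hypothetical names. It also simplifies the high-water mark to the largest id seen, whereas the code above assigns lastExecutedChange on every iteration.

```scala
object ConfigChangeSketch {
  // Assumed name format: fixed prefix plus a zero-padded sequence number.
  val Prefix = "config_change_"

  def changeNumber(name: String): Long = name.stripPrefix(Prefix).toLong

  // Returns the change ids that still need processing, in order,
  // together with the new value of the last-executed marker.
  def pending(notifications: Seq[String], lastExecuted: Long): (Seq[Long], Long) = {
    // Zero padding makes lexicographic order match numeric order.
    val ids = notifications.sorted.map(changeNumber)
    val fresh = ids.filter(_ > lastExecuted)
    (fresh, (lastExecuted +: ids).max)
  }
}
```

Sorting first and skipping ids at or below the marker means a notification is applied at most once, even though the full child list is re-read every time the watch fires.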

2.3.10 KafkaHealthcheck

Liveness is also maintained through ZooKeeper: the broker registers itself in ZooKeeper and listens for session expiration.

  def startup() {
    zkUtils.zkClient.subscribeStateChanges(sessionExpireListener)
    register()
  }

  /**
   * Register this broker as "alive" in zookeeper
   */
  def register() {
    val jmxPort = System.getProperty("com.sun.management.jmxremote.port", "-1").toInt
    val updatedEndpoints = advertisedEndpoints.mapValues(endpoint =>
      if (endpoint.host == null || endpoint.host.trim.isEmpty)
        EndPoint(InetAddress.getLocalHost.getCanonicalHostName, endpoint.port, endpoint.protocolType)
      else
        endpoint
    )

    // the default host and port are here for compatibility with older client
    // only PLAINTEXT is supported as default
    // if the broker doesn't listen on PLAINTEXT protocol, an empty endpoint will be registered and older clients will be unable to connect
    val plaintextEndpoint = updatedEndpoints.getOrElse(SecurityProtocol.PLAINTEXT, new EndPoint(null,-1,null))
    zkUtils.registerBrokerInZk(brokerId, plaintextEndpoint.host, plaintextEndpoint.port, updatedEndpoints, jmxPort)
  }

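3. Summary

The KafkaServer class acts as a facade (the Facade pattern) and is the entry point for network handling, I/O handling, and so on.

ReplicaManager: replica management

KafkaApis: the proxy class that handles every request; it picks the concrete handler based on the request key

KafkaRequestHandlerPool: the thread pool that handles requests; its size is set by num.io.threads (the number of I/O threads)

LogManager: manages Kafka's file storage; responsible for handling and storing the partition data of every Kafka topic

TopicConfigManager: watches the children of the ZooKeeper node /config/changes/ and updates topic configuration through LogManager, i.e. topic-level configuration management; see the topic-level configs for details (in the code shown above this role is played by DynamicConfigManager together with TopicConfigHandler)

KafkaHealthcheck: listens for ZooKeeper session expiration and registers the broker's information in ZooKeeper so that other brokers and consumers can look it up

KafkaController: the cluster's central controller; handles controller election, leader election, and replica assignment

KafkaScheduler: schedules background work for replica management, log management, and so on

ZkClient: registers ZooKeeper-related information

BrokerTopicStats: topic statistics and monitoring

ControllerStats: controller statistics and monitoring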
